Converted Notebook
Converted from IPYNB format
code cell
In [1]:
library(ArchR)
library(parallel)
library(ggplot2)
library(dplyr)
Output:
/ |
/ \
. / |.
\\\ / |.
\\\ / `|.
\\\ / |.
\ / |\
\\#####\ / ||
==###########> / ||
\\##==......\ / ||
______ = =|__ /__ || \\\
,--' ,----`-,__ ___/' --,-`-===================##========>
\ ' ##_______ _____ ,--,__,=##,__ ///
, __== ___,-,__,--'#' ===' `-' | ##,-/
-,____,---' \\####\\________________,--\\_##,/
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
ArchR : Version 1.0.3
For more information see our website : www.ArchRProject.com
If you encounter a bug please report : https://github.com/GreenleafLab/ArchR/issues
Loading Required Packages...
Loading Package : devtools v2.4.5
Warning message:
“package ‘devtools’ was built under R version 4.3.3”
Warning message:
“package ‘usethis’ was built under R version 4.3.3”
Loading Package : grid v4.3.2
Loading Package : gridExtra v2.3
Loading Package : gtools v3.9.5
Loading Package : gtable v0.3.5
Warning message:
“package ‘gtable’ was built under R version 4.3.3”
Loading Package : ggplot2 v3.5.1
Warning message:
“package ‘ggplot2’ was built under R version 4.3.3”
Loading Package : magrittr v2.0.3
Loading Package : plyr v1.8.9
Loading Package : stringr v1.5.1
Loading Package : data.table v1.17.8
Warning message:
“package ‘data.table’ was built under R version 4.3.3”
Loading Package : matrixStats v1.5.0
Loading Package : sparseMatrixStats v1.14.0
Warning message:
“package ‘sparseMatrixStats’ was built under R version 4.3.3”
Warning message:
“package ‘MatrixGenerics’ was built under R version 4.3.3”
Loading Package : S4Vectors v0.40.2
Warning message:
“package ‘S4Vectors’ was built under R version 4.3.3”
Loading Package : GenomicRanges v1.54.1
Warning message:
“package ‘IRanges’ was built under R version 4.3.3”
Loading Package : BiocGenerics v0.48.1
Loading Package : Matrix v1.6.3
Loading Package : Rcpp v1.1.0
Loading Package : RcppArmadillo v0.12.8.2.1
Warning message:
“package ‘RcppArmadillo’ was built under R version 4.3.3”
Loading Package : SummarizedExperiment v1.32.0
Loading Package : rhdf5 v2.46.1
Warning message:
“package ‘rhdf5’ was built under R version 4.3.3”
Attaching package: ‘dplyr’
The following object is masked from ‘package:Biobase’:
combine
The following objects are masked from ‘package:GenomicRanges’:
intersect, setdiff, union
The following object is masked from ‘package:GenomeInfoDb’:
intersect
The following objects are masked from ‘package:IRanges’:
collapse, desc, intersect, setdiff, slice, union
The following objects are masked from ‘package:S4Vectors’:
first, intersect, rename, setdiff, setequal, union
The following objects are masked from ‘package:BiocGenerics’:
combine, intersect, setdiff, union
The following object is masked from ‘package:matrixStats’:
count
The following objects are masked from ‘package:data.table’:
between, first, last
The following objects are masked from ‘package:plyr’:
arrange, count, desc, failwith, id, mutate, rename, summarise,
summarize
The following object is masked from ‘package:gridExtra’:
combine
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
markdown cell
1. Create Arrow Files
code cell
In [2]:
setwd('/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo')
code cell
In [3]:
inputFiles_example <- c("/share2/pub/zhouyj/zhouyj/Human_eye_atlas/GSE228370/SRR23990988/outs/fragments.tsv.gz",
"/share2/pub/zhouyj/zhouyj/Human_eye_atlas/GSE228370/SRR23990989/outs/fragments.tsv.gz")
names(inputFiles_example) <- c("fetal_eye_except_retina","fetal_retina")
code cell
In [4]:
addArchRGenome("hg38")
## Setting default genome to Hg38.
addArchRThreads(threads = 16)
## Setting default number of Parallel threads to 16.
Output:
Setting default genome to Hg38.
Setting default number of Parallel threads to 16.
code cell
In [6]:
ArrowFiles <- createArrowFiles(
inputFiles = inputFiles_example,
sampleNames = names(inputFiles_example),
minTSS = 4, #Dont set this too high because you can always increase later
minFrags = 1000,
addTileMat = TRUE,
addGeneScoreMat = TRUE
)
Output:
Using GeneAnnotation set by addArchRGenome(Hg38)!
Using GeneAnnotation set by addArchRGenome(Hg38)!
Warning message:
“package ‘XVector’ was built under R version 4.3.3”
Warning message:
“package ‘rtracklayer’ was built under R version 4.3.3”
ArchR logging to : ArchRLogs/ArchR-createArrows-6f9e0486afd59-Date-2025-10-29_Time-15-08-19.252704.log
If there is an issue, please report to github with logFile!
Cleaning Temporary Files
subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking`
2025-10-29 15:08:19.788029 : Batch Execution w/ safelapply!, 0 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-createArrows-6f9e0486afd59-Date-2025-10-29_Time-15-08-19.252704.log
markdown cell
2. Doublet inference
code cell
In [7]:
doubScores <- addDoubletScores(
input = ArrowFiles,
k = 10, #Refers to how many cells near a "pseudo-doublet" to count.
knnMethod = "UMAP", #Refers to the embedding to use for nearest neighbor search with doublet projection.
LSIMethod = 1
)
Output:
ArchR logging to : ArchRLogs/ArchR-addDoubletScores-6f9e033bc5927-Date-2025-10-29_Time-15-23-15.87131.log
If there is an issue, please report to github with logFile!
2025-10-29 15:23:15.969165 : Batch Execution w/ safelapply!, 0 mins elapsed.
2025-10-29 15:23:15.978678 : fetal_eye_except_retina (1 of 2) : Computing Doublet Statistics, 0 mins elapsed.
Warning message:
“package ‘sp’ was built under R version 4.3.3”
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
fetal_eye_except_retina (1 of 2) : UMAP Projection R^2 = 0.97113
fetal_eye_except_retina (1 of 2) : UMAP Projection R^2 = 0.97113
2025-10-29 15:26:23.22016 : fetal_retina (2 of 2) : Computing Doublet Statistics, 3.121 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
fetal_retina (2 of 2) : UMAP Projection R^2 = 0.88655
fetal_retina (2 of 2) : UMAP Projection R^2 = 0.88655
ArchR logging successful to : ArchRLogs/ArchR-addDoubletScores-6f9e033bc5927-Date-2025-10-29_Time-15-23-15.87131.log
markdown cell
3. Creating An ArchRProject
code cell
In [8]:
projHeme1 <- ArchRProject(
ArrowFiles = ArrowFiles,
outputDirectory = "GSEGSE228370_26_post-conceptional_weeks",
copyArrows = TRUE
)
Output:
Using GeneAnnotation set by addArchRGenome(Hg38)!
Using GeneAnnotation set by addArchRGenome(Hg38)!
Validating Arrows...
Getting SampleNames...
Copying ArrowFiles to Ouptut Directory! If you want to save disk space set copyArrows = FALSE
1
2
Getting Cell Metadata...
Merging Cell Metadata...
Initializing ArchRProject...
/ |
/ \
. / |.
\\\ / |.
\\\ / `|.
\\\ / |.
\ / |\
\\#####\ / ||
==###########> / ||
\\##==......\ / ||
______ = =|__ /__ || \\\
,--' ,----`-,__ ___/' --,-`-===================##========>
\ ' ##_______ _____ ,--,__,=##,__ ///
, __== ___,-,__,--'#' ===' `-' | ##,-/
-,____,---' \\####\\________________,--\\_##,/
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [9]:
head(projHeme1@cellColData)
Result:
DataFrame with 6 rows and 15 columns
Sample TSSEnrichment
<Rle> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 fetal_eye_except_ret.. 20.692
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 fetal_eye_except_ret.. 14.577
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 fetal_eye_except_ret.. 24.611
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 fetal_eye_except_ret.. 22.153
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 fetal_eye_except_ret.. 13.669
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 fetal_eye_except_ret.. 21.086
ReadsInTSS ReadsInPromoter
<array> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 47891 43771
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 21768 22048
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 66295 58896
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 49281 45561
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 21123 20829
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 45575 40526
ReadsInBlacklist PromoterRatio
<array> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 1627 0.227511824938926
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 1700 0.115316220004603
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 1484 0.312740943702807
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1395 0.262448156682028
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 1440 0.130371918931437
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1365 0.259952019910454
PassQC NucleosomeRatio
<array> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 1 0.314821902080315
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 1 0.593220338983051
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 1 0.409870184317307
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1 0.356758784544204
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 1 0.492164004856636
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1 0.506639349015212
nMultiFrags nMonoFrags nFrags
<array> <array> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 8751 73162 96195
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 5815 60003 95598
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 6227 66787 94161
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 10097 63976 86800
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 11363 53535 79883
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 5615 51737 77949
nDiFrags DoubletScore
<array> <array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 14282 32.3282521623482
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 29780 20.4734703975564
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 21147 10.1135106689874
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 12727 2.4897703541641
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 14985 9.02590977818845
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 20597 0.330631173274085
DoubletEnrichment
<array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 3.36
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 2.84
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 2.32
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 1.8
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 2.24
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 1.6
BlacklistRatio
<array>
fetal_eye_except_retina#ACAGAAAAGGTTCGTT-1 0.00845678049794688
fetal_eye_except_retina#TAAGTGCTCACCACAA-1 0.00889139940166112
fetal_eye_except_retina#AATGTCGGTTCCAATG-1 0.00788012021962384
fetal_eye_except_retina#GATGGCCTCCTGTAGA-1 0.00803571428571428
fetal_eye_except_retina#GCTGCGACAGCGTGAA-1 0.00901318177835084
fetal_eye_except_retina#ACTATTCCATTAGCCA-1 0.00875572489704807
code cell
In [10]:
df <- getCellColData(projHeme1, select = c("log10(nFrags)", "TSSEnrichment"))
code cell
In [11]:
p <- ggPoint(
x = df[,1],
y = df[,2],
colorDensity = TRUE,
continuousSet = "sambaNight",
xlabel = "Log10 Unique Fragments",
ylabel = "TSS Enrichment",
xlim = c(log10(500), quantile(df[,1], probs = 0.99)),
ylim = c(0, quantile(df[,2], probs = 0.99))
) + geom_hline(yintercept = 4, lty = "dashed") + geom_vline(xintercept = 3, lty = "dashed")
p
code cell
In [12]:
p1 <- plotGroups(
ArchRProj = projHeme1,
groupBy = "Sample",
colorBy = "cellColData",
name = "TSSEnrichment",
plotAs = "ridges",
baseSize = 10
)
p1
Output:
1
Picking joint bandwidth of 1.1
code cell
In [13]:
p2 <- plotGroups(
ArchRProj = projHeme1,
groupBy = "Sample",
colorBy = "cellColData",
name = "TSSEnrichment",
plotAs = "violin",
alpha = 0.4,
baseSize = 10,
addBoxPlot = TRUE,
)
p2
Output:
1
code cell
In [14]:
p3 <- plotFragmentSizes(ArchRProj = projHeme1)
p3
Output:
ArchR logging to : ArchRLogs/ArchR-plotFragmentSizes-6f9e065d2ae8c-Date-2025-10-29_Time-15-39-03.720819.log
If there is an issue, please report to github with logFile!
ArchR logging successful to : ArchRLogs/ArchR-plotFragmentSizes-6f9e065d2ae8c-Date-2025-10-29_Time-15-39-03.720819.log
code cell
In [15]:
p4 <- plotTSSEnrichment(ArchRProj = projHeme1)
p4
Output:
ArchR logging to : ArchRLogs/ArchR-plotTSSEnrichment-6f9e06e390d10-Date-2025-10-29_Time-15-39-35.463365.log
If there is an issue, please report to github with logFile!
subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking`
2025-10-29 15:39:36.048239 : fetal_eye_except_retina Computing TSS (1 of 2)!, 0.01 mins elapsed.
2025-10-29 15:40:38.608483 : fetal_eye_except_retina Finished Computing TSS (1 of 2)!, 1.052 mins elapsed.
2025-10-29 15:40:38.65165 : fetal_retina Computing TSS (2 of 2)!, 1.053 mins elapsed.
2025-10-29 15:41:08.583292 : fetal_retina Finished Computing TSS (2 of 2)!, 1.552 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-plotTSSEnrichment-6f9e06e390d10-Date-2025-10-29_Time-15-39-35.463365.log
code cell
In [16]:
projHeme1 <- saveArchRProject(ArchRProj = projHeme1, outputDirectory = "Save-ProjHeme1", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme1
Copying Arrow Files...
Copying Arrow Files (1 of 2)
Copying Arrow Files (2 of 2)
Getting ImputeWeights
No imputeWeights found, returning NULL
Copying Other Files...
Saving ArchRProject...
Loading ArchRProject...
Successfully loaded ArchRProject!
/ |
/ \
. / |.
\\\ / |.
\\\ / `|.
\\\ / |.
\ / |\
\\#####\ / ||
==###########> / ||
\\##==......\ / ||
______ = =|__ /__ || \\\
,--' ,----`-,__ ___/' --,-`-===================##========>
\ ' ##_______ _____ ,--,__,=##,__ ///
, __== ___,-,__,--'#' ===' `-' | ##,-/
-,____,---' \\####\\________________,--\\_##,/
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [17]:
projHeme2 <- filterDoublets(projHeme1)
Output:
Filtering 297 cells from ArchRProject!
fetal_eye_except_retina : 285 of 5346 (5.3%)
fetal_retina : 12 of 1136 (1.1%)
code cell
rm(projHeme1)
gc()
markdown cell
4. Dimensionality Reduction
code cell
In [27]:
projHeme2 <- addIterativeLSI(
ArchRProj = projHeme2,
useMatrix = "TileMatrix",
name = "IterativeLSI",
iterations = 2,
clusterParams = list( #See Seurat::FindClusters
resolution = c(0.2), #or c(0.2,0.4,0.6)
sampleCells = 10000,
n.start = 10
),
varFeatures = 25000,
dimsToUse = 1:30
)
Output:
Checking Inputs...
ArchR logging to : ArchRLogs/ArchR-addIterativeLSI-6f9e030c91e29-Date-2025-10-29_Time-19-37-27.020365.log
If there is an issue, please report to github with logFile!
code cell
In [6]:
#Batch effect correction
projHeme2 <- addHarmony(
ArchRProj = projHeme2,
reducedDims = "IterativeLSI",
name = "Harmony",
groupBy = "Sample"
)
Output:
Warning message:
“package ‘harmony’ was built under R version 4.3.3”
Transposing data matrix
Initializing state using k-means centroids initialization
Harmony 1/10
Harmony 2/10
Harmony 3/10
Harmony converged after 3 iterations
code cell
In [7]:
packageVersion('harmony')
markdown cell
5. Clustering using Seurat
code cell
In [8]:
projHeme2 <- addClusters(
input = projHeme2,
reducedDims = "IterativeLSI",
method = "Seurat",
name = "Clusters",
resolution = 0.8
)
Output:
ArchR logging to : ArchRLogs/ArchR-addClusters-91a87078eb9c-Date-2025-10-30_Time-09-28-26.056578.log
If there is an issue, please report to github with logFile!
Warning message:
“package ‘sp’ was built under R version 4.3.3”
2025-10-30 09:28:38.222482 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.191 mins elapsed.
Computing nearest neighbor graph
Computing SNN
Output:
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 6185
Number of edges: 248893
Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.8538
Number of communities: 15
Elapsed time: 0 seconds
Output:
2025-10-30 09:28:44.91351 : Testing Outlier Clusters, 0.303 mins elapsed.
2025-10-30 09:28:44.985308 : Assigning Cluster Names to 15 Clusters, 0.304 mins elapsed.
2025-10-30 09:28:45.08451 : Finished addClusters, 0.306 mins elapsed.
code cell
In [9]:
cM <- confusionMatrix(paste0(projHeme2$Clusters), paste0(projHeme2$Sample))
cM
Result:
15 x 2 sparse Matrix of class "dgCMatrix"
fetal_eye_except_retina fetal_retina
C3 85 184
C1 330 .
C12 459 24
C5 247 34
C10 978 9
C8 85 1
C11 1668 5
C4 103 388
C6 117 136
C7 301 5
C14 223 69
C13 58 55
C15 223 155
C2 12 57
C9 172 2
code cell
In [10]:
#plot confusion matrix as a heatmap
library(pheatmap)
cM <- cM / Matrix::rowSums(cM)
p5 <- pheatmap::pheatmap(
mat = as.matrix(cM),
color = paletteContinuous("whiteBlue"),
border_color = "black"
)
p5
Output:
Warning message:
“package ‘pheatmap’ was built under R version 4.3.3”
markdown cell
6. Single-cell Embedding
code cell
In [11]:
projHeme2 <- addUMAP(
ArchRProj = projHeme2,
reducedDims = "IterativeLSI",
name = "UMAP",
nNeighbors = 30,
minDist = 0.5,
metric = "cosine"
)
Output:
09:31:46 UMAP embedding parameters a = 0.583 b = 1.334
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
09:31:46 Read 6185 rows and found 30 numeric columns
09:31:46 Using Annoy for neighbor search, n_neighbors = 30
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
09:31:46 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
|
09:31:48 Writing NN index file to temp file /tmp/RtmpJIsaLF/file91a850ab997b
09:31:48 Searching Annoy index using 20 threads, search_k = 3000
09:31:48 Annoy recall = 90.9%
09:31:49 Commencing smooth kNN distance calibration using 20 threads
with target n_neighbors = 30
09:31:50 Initializing from normalized Laplacian + noise (using RSpectra)
09:31:50 Commencing optimization for 500 epochs, with 299900 positive edges
09:32:12 Optimization finished
09:32:12 Creating temp model dir /tmp/RtmpJIsaLF/dir91a817c363cf
09:32:12 Creating dir /tmp/RtmpJIsaLF/dir91a817c363cf
09:32:13 Changing to /tmp/RtmpJIsaLF/dir91a817c363cf
09:32:13 Creating /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2/Embeddings/Save-Uwot-UMAP-Params-IterativeLSI-91a8276ff40d-Date-2025-10-30_Time-09-32-12.614686.tar
code cell
In [13]:
#Plot and save by sample and clusters
p6 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Sample", embedding = "UMAP")
p7 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Clusters", embedding = "UMAP")
ggAlignPlots(p6, p7, type = "h")
plotPDF(p6,p7, name = "Plot-UMAP-Sample-Clusters.pdf", ArchRProj = projHeme2, addDOC = FALSE, width = 5, height = 5)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a864fd6c48-Date-2025-10-30_Time-09-34-32.389676.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a864fd6c48-Date-2025-10-30_Time-09-34-32.389676.log
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8669fb1a8-Date-2025-10-30_Time-09-34-32.864689.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8669fb1a8-Date-2025-10-30_Time-09-34-32.864689.log
Plotting Ggplot!
Plotting Ggplot!
code cell
In [14]:
#Vesion after harmony
projHeme2 <- addUMAP(
ArchRProj = projHeme2,
reducedDims = "Harmony",
name = "UMAPHarmony",
nNeighbors = 30,
minDist = 0.5,
metric = "cosine"
)
Output:
09:36:55 UMAP embedding parameters a = 0.583 b = 1.334
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
09:36:55 Read 6185 rows and found 30 numeric columns
09:36:55 Using Annoy for neighbor search, n_neighbors = 30
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
09:36:55 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
|
09:36:57 Writing NN index file to temp file /tmp/RtmpJIsaLF/file91a84363871a
09:36:57 Searching Annoy index using 20 threads, search_k = 3000
09:36:57 Annoy recall = 100%
09:36:58 Commencing smooth kNN distance calibration using 20 threads
with target n_neighbors = 30
09:36:59 Initializing from normalized Laplacian + noise (using RSpectra)
09:36:59 Commencing optimization for 500 epochs, with 290260 positive edges
09:37:20 Optimization finished
09:37:20 Creating temp model dir /tmp/RtmpJIsaLF/dir91a87a4a6e04
09:37:20 Creating dir /tmp/RtmpJIsaLF/dir91a87a4a6e04
09:37:21 Changing to /tmp/RtmpJIsaLF/dir91a87a4a6e04
09:37:21 Creating /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2/Embeddings/Save-Uwot-UMAP-Params-Harmony-91a87236768c-Date-2025-10-30_Time-09-37-20.579837.tar
code cell
In [15]:
p8 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Sample", embedding = "UMAPHarmony")
p9 <- plotEmbedding(ArchRProj = projHeme2, colorBy = "cellColData", name = "Clusters", embedding = "UMAPHarmony")
ggAlignPlots(p8, p9, type = "h")
plotPDF(p8,p9, name = "Plot-UMAPHarmony-Sample-Clusters.pdf", ArchRProj = projHeme2, addDOC = FALSE, width = 5, height = 5)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8465bb1fd-Date-2025-10-30_Time-09-37-38.971252.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8465bb1fd-Date-2025-10-30_Time-09-37-38.971252.log
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8586fdecc-Date-2025-10-30_Time-09-37-45.172277.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8586fdecc-Date-2025-10-30_Time-09-37-45.172277.log
Plotting Ggplot!
Plotting Ggplot!
markdown cell
7. Gene Scores and Marker Genes
code cell
In [19]:
markersGS <- getMarkerFeatures(
ArchRProj = projHeme2,
useMatrix = "GeneScoreMatrix",
groupBy = "Clusters",
bias = c("TSSEnrichment", "log10(nFrags)"),
testMethod = "wilcoxon"
)
Output:
ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-91a831f7580f-Date-2025-10-30_Time-10-01-28.703741.log
If there is an issue, please report to github with logFile!
MatrixClass = Sparse.Double.Matrix
2025-10-30 10:01:34.688203 : Matching Known Biases, 0.007 mins elapsed.
###########
2025-10-30 10:02:20.264148 : Completed Pairwise Tests, 0.767 mins elapsed.
###########
ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-91a831f7580f-Date-2025-10-30_Time-10-01-28.703741.log
code cell
In [18]:
projHeme2
Output:
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
Result:
class: ArchRProject
outputDirectory: /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme2
samples(2): fetal_eye_except_retina fetal_retina
sampleColData names(1): ArrowFiles
cellColData names(16): Sample TSSEnrichment ... BlacklistRatio Clusters
numberOfCells(1): 6185
medianTSS(1): 17.885
medianFrags(1): 9945
code cell
In [20]:
markerList <- getMarkers(markersGS, cutOff = "FDR <= 0.01 & Log2FC >= 1.25")
markerList$C1
Result:
DataFrame with 333 rows and 9 columns
seqnames start end strand name idx Log2FC
<Rle> <integer> <integer> <integer> <character> <integer> <numeric>
15911 chr22 37988794 37988853 1 MIR4534 340 3.08574
15909 chr22 37952607 38041106 1 POLR2F 338 2.39788
14341 chr2 222298996 222199888 2 PAX3 1392 2.71089
15910 chr22 37987422 37970686 2 SOX10 339 3.34180
11420 chr19 3539154 3544030 1 C19orf71 124 3.27196
... ... ... ... ... ... ... ...
648 chr1 40317816 40300487 2 COL9A2 648 1.40391
17414 chr3 186635969 186653141 1 FETUB 1282 1.44268
9830 chr17 17813545 17813480 2 MIR6777 344 1.29721
24613 chrX 119565408 119538149 2 CXorf56 650 1.49323
2364 chr1 241132272 241132346 1 MIR3123 2364 2.71766
FDR MeanDiff
<numeric> <numeric>
15911 7.92171e-48 2.560543
15909 6.27628e-47 1.762458
14341 6.87379e-44 1.080938
15910 2.83076e-34 0.922795
11420 9.84625e-34 1.537612
... ... ...
648 0.00901485 0.313458
17414 0.00923812 0.073252
9830 0.00925712 0.353038
24613 0.00969600 0.203829
2364 0.00975850 0.740864
code cell
In [21]:
markerGenes <- gene_markers <- c(
"CAND1","HES1","SOX2", ##RPCs
"ATOH7","HES6","DLL3", ##NPCs
"RGR","RLBP1","GPX3", ##MGCs
"NRL","RHO","GNGT1", ##Rods
"ONECUT2","ONECUT1", ##HCs
"VSX1","TMEM215", ##BCs
"GAD2","TFAP2A", ##ACs
"GAP43","SNCG","NEFM", ##RGCs
"DCN","MGP", ##Fibroblasts
"PMEL","TYRP1", ##Melanocytes & RPE
"RPE65","TTR", ##RPE
"PDE6H","THRB","PRDM1", ##Cones
"CRYGC","LIM2", ##Lens cells
"C1QA","C1QB", ##Microphages
"TMEM119","CX3CR1", ##Microglia
"KRT5", ##Corneal epithelium
"PECAM1","ENG" ## Endothelium
)
heatmapGS <- markerHeatmap(
seMarker = markersGS,
cutOff = "FDR <= 0.01 & Log2FC >= 1.25",
labelMarkers = markerGenes,
transpose = TRUE
)
Output:
Warning message:
“'markerHeatmap' is deprecated.
Use 'plotMarkerHeatmap' instead.
See help("Deprecated")”
ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-91a8425062b4-Date-2025-10-30_Time-10-07-58.306658.log
If there is an issue, please report to github with logFile!
Printing Top Marker Genes:
C1:
SSU72, TNFRSF14, RBP7, NPPA-AS1, AKR7L, MIR4425, TRIM63, TRNP1, PABPC4-AS1, COL9A2, RNU5F-1, MIR4711, LINC00466, IL12RB2, PROK1
C2:
EPHA10, FAM78B, CAMK1D, PLEKHA7, LOC101928943, P3H3, MYO16, MEIS2, MIR8063, GSE1, TMEM132E, SLC1A6, SULT2A1, THRB, PROM1
C3:
LINC01654, LOC101927342, OVGP1, LINC01750, C1orf68, LCE1F, IVL, SPRR4, FCRL4, FCRL2, CCDC181, LINC01699, KCNK2, SUSD4, OR2C3
C4:
AKR7A3, LACTBL1, CNKSR1, DNAJC6, KIAA1107, LINC01356, FCRL5, KIF21B, MIR135B, USH2A, GPATCH2, SPHAR, LOC101927787, PRKCQ-AS1, C1QL3
C5:
DISP3, ZDHHC18, MPL, MED8, KLF17, LINC01398, DMBX1, TTLL7, WDR47, SLC6A17, MAEL, LOC102724919, PDC, PKP1, KCNMA1-AS3
C6:
CSF3R, NTNG1, CELF3, RIIAD1, CADM3, FCER1A, LHX4, RGS8, RGS18, DRD4, CEND1, SLC6A5, LINC00678, BDNF, SLC1A2
C7:
LOC102724659, TIE1, BTBD19, CYP4B1, TAL1, S1PR1, LOC101928370, MIR7852, MOV10, S100A16, MIR8083, TPM3, SHE, TDRD10, DCST1
C8:
TMEM132C, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3, LOC102724659, NPPA-AS1, LINC01654, AKR7L
C9:
CLMAT3, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3, LOC102724659, NPPA-AS1, LINC01654, AKR7L
C10:
MIR942, FAM99B, NAV2-AS5, ALKBH3-AS1, CCDC89, C1R, CSNK1A1L, MIR548X2, MIR337, MIR1276, FOXL1, MFAP4, ABCA8, ELANE, COL5A3
C11:
FAM87B, MIR6727, CORT, FAM43B, COL8A2, NGF, S100A14, MIR214, PITRM1-AS1, GPRC5D-AS1, GLIPR1, SERTM1, TGM1, MIR342, USP50
C12:
MIR6127, THEMIS2, DENND2D, CHI3L2, C1orf162, TMIGD3, CD1C, CD1E, CD48, ATP6V1G3, PTPRC, NLRP3, OR2T1, OR2T2, MRC1
C13:
FAM181A, LINC00511, LINC00297, MIR4534, SLC35F1, FAM87B, MIR6726, MIR6727, SSU72, PLCH2, TNFRSF14, RBP7, CORT, CASZ1, DISP3
C14:
PIFO, LMX1A, CACNA1E, LINC01031, SLIT1-AS1, LOC101927549, LMO1, CALCB, CALCA, PAX6, PAX6-AS1, PAUPAR, OR5A2, LGALS12, NTM-IT
C15:
BARHL2, LINC02609, IGSF9B, B3GAT1, LOC283177, SLC6A12, KRT77, DTX1, ONECUT3, CHST8, MYT1L-AS1, SLC32A1, LOC100129027, MAB21L2, LINC01287
Identified 1719 markers!
Output:
[1] "CAND1" "HES1" "SOX2" "ATOH7" "HES6" "DLL3" "RGR"
[8] "RLBP1" "GPX3" "NRL" "RHO" "GNGT1" "ONECUT2" "ONECUT1"
[15] "VSX1" "TMEM215" "GAD2" "TFAP2A" "GAP43" "SNCG" "NEFM"
[22] "DCN" "MGP" "PMEL" "TYRP1" "RPE65" "TTR" "PDE6H"
[29] "THRB" "PRDM1" "CRYGC" "LIM2" "C1QA" "C1QB" "TMEM119"
[36] "CX3CR1" "KRT5" "PECAM1" "ENG"
Output:
Adding Annotations..
Preparing Main Heatmap..
ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-91a8425062b4-Date-2025-10-30_Time-10-07-58.306658.log
code cell
In [22]:
##Set order by @row_order
#Exp: heatmapGS@row_order <- c(10,11,12,3,4,5,1,2,6,8,7,9)
ComplexHeatmap::draw(heatmapGS, heatmap_legend_side = "bot", annotation_legend_side = "bot")
code cell
In [23]:
markerGenes <- c(
"CCND1", ##RPCs
"ATOH7", ##NPCs
"GAP43", ##RGCs
"ONECUT1", ##HCs
"TFAP2A", ##ACs
"PRDM1", ##Cones
"NRL" ##Rods
)
p8 <- plotEmbedding(
ArchRProj = projHeme2,
colorBy = "GeneScoreMatrix",
name = markerGenes,
embedding = "UMAPHarmony",
quantCut = c(0.01, 0.95),
imputeWeights = NULL
)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a85b0b3c9b-Date-2025-10-30_Time-10-14-59.339542.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = GeneScoreMatrix
Getting Matrix Values...
2025-10-30 10:15:07.568725 :
Plotting Embedding
1
2
3
4
5
6
7
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a85b0b3c9b-Date-2025-10-30_Time-10-14-59.339542.log
code cell
In [24]:
p8$CCND1
code cell
In [25]:
p9 <- lapply(p8, function(x){
x + guides(color = FALSE, fill = FALSE) +
theme_ArchR(baseSize = 6.5) +
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()
)
})
do.call(cowplot::plot_grid, c(list(ncol = 3),p9))
Output:
Warning message:
“[1m[22mThe `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as of ggplot2 3.3.4.”
code cell
In [26]:
#Marker Genes Imputation with MAGIC
projHeme2 <- addImputeWeights(projHeme2)
markerGenes <- c(
"CCND1", ##RPCs
"ATOH7", ##NPCs
"GAP43", ##RGCs
"ONECUT1", ##HCs
"TFAP2A", ##ACs
"PRDM1", ##Cones
"NRL" ##Rods
)
p10 <- plotEmbedding(
ArchRProj = projHeme2,
colorBy = "GeneScoreMatrix",
name = markerGenes,
embedding = "UMAPHarmony",
quantCut = c(0.01, 0.95),
imputeWeights = getImputeWeights(projHeme2)
)
Output:
ArchR logging to : ArchRLogs/ArchR-addImputeWeights-91a878d0a9f-Date-2025-10-30_Time-10-17-45.350946.log
If there is an issue, please report to github with logFile!
2025-10-30 10:17:45.519319 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
Getting ImputeWeights
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a87fa0d8c7-Date-2025-10-30_Time-10-17-52.494176.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = GeneScoreMatrix
Getting Matrix Values...
2025-10-30 10:17:52.894195 :
Imputing Matrix
Using weights on disk
1 of 1
Using weights on disk
1 of 1
Plotting Embedding
1
2
3
4
5
6
7
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a87fa0d8c7-Date-2025-10-30_Time-10-17-52.494176.log
code cell
In [27]:
p10$CCND1
code cell
In [28]:
p11 <- lapply(p10, function(x){
x + guides(color = FALSE, fill = FALSE) +
theme_ArchR(baseSize = 6.5) +
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()
)
})
do.call(cowplot::plot_grid, c(list(ncol = 3),p11))
code cell
In [42]:
#Module Scores
#Each group needs atleast 2 genes
#features <- list(
# RPCsScore = c("CAND1","HES1","SOX2"),
# NPCsScore = c("ATOH7","HES6","DLL3"),
# MGCsScore = c("RGR","RLBP1","GPX3"),
# RodsScore = c("NRL","RHO","GNGT1"),
# HCscore = c("ONECUT2","ONECUT1"),
# BCscore = c("VSX1","TMEM215"),
# ACscore = c("GAD2","TFAP2A"),
# RGCsScore = c("GAP43","SNCG","NEFM"),
# FibroblastsScore = c("DCN","MGP"),
# MelanocytesRPEScore = c("PMEL","TYRP1"),
# RPEScore = c("RPE65","TTR"),
# ConesScore = c("PDE6H","THRB","PRDM1"),
# LensCellsScore = c("CRYGC","LIM2"),
# MicrophagesScore = c("C1QA","C1QB"),
# MicrogliaScore = c("TMEM119","CX3CR1"),
# EndotheliumScore = c("PECAM1","ENG")
#)
features <- list(
RPCsScore = c("CAND1","HES1","SOX2")
)
projHeme2 <- addModuleScore(projHeme2,
useMatrix = "GeneScoreMatrix",
name = "Module",
features = features)
Output:
ArchR logging to : ArchRLogs/ArchR-addModuleScore-91a81df8a4af-Date-2025-10-30_Time-10-44-41.57271.log
If there is an issue, please report to github with logFile!
2025-10-30 10:44:43.343194 : Computing Module 1 of 1, 0.026 mins elapsed.
Overriding previous entry for Module.RPCsScore
2025-10-30 10:44:51.719661 : Finished Running addModuleScore, 0.166 mins elapsed.
code cell
In [43]:
p12 <- plotEmbedding(projHeme2,
embedding = "UMAPHarmony",
colorBy = "cellColData",
name="Module.RPCsScore",
imputeWeights = getImputeWeights(projHeme2))
Output:
Getting ImputeWeights
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a82e3b21fc-Date-2025-10-30_Time-10-44-51.745916.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Imputing Matrix
Using weights on disk
1 of 1
Using weights on disk
1 of 1
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a82e3b21fc-Date-2025-10-30_Time-10-44-51.745916.log
code cell
In [39]:
#Track Plotting
set.seed(1234)
markerGenes <- c(
"CCND1", ##RPCs
"ATOH7", ##NPCs
"GAP43", ##RGCs
"ONECUT1", ##HCs
"TFAP2A", ##ACs
"PRDM1", ##Cones
"NRL" ##Rods
)
p <- plotBrowserTrack(
ArchRProj = projHeme2,
groupBy = "Clusters",
geneSymbol = markerGenes,
upstream = 50000,
downstream = 50000
)
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a8e0dc389-Date-2025-10-30_Time-10-41-27.542857.log
If there is an issue, please report to github with logFile!
2025-10-30 10:41:27.715202 : Validating Region, 0.003 mins elapsed.
Output:
GRanges object with 7 ranges and 2 metadata columns:
seqnames ranges strand | gene_id symbol
<Rle> <IRanges> <Rle> | <character> <character>
[1] chr11 69641087-69654474 + | 595 CCND1
[2] chr10 68230624-68232103 - | 220202 ATOH7
[3] chr3 115623324-115721490 + | 2596 GAP43
[4] chr15 52756989-52791078 - | 3175 ONECUT1
[5] chr6 10393186-10419659 - | 7020 TFAP2A
[6] chr6 106086320-106109939 + | 639 PRDM1
[7] chr14 24080107-24115014 - | 4901 NRL
-------
seqinfo: 24 sequences from hg38 genome
Output:
2025-10-30 10:41:27.844366 : Adding Bulk Tracks (1 of 7), 0.005 mins elapsed.
2025-10-30 10:41:28.942532 : Adding Gene Tracks (1 of 7), 0.023 mins elapsed.
2025-10-30 10:41:29.423961 : Plotting, 0.031 mins elapsed.
2025-10-30 10:41:31.003763 : Adding Bulk Tracks (2 of 7), 0.058 mins elapsed.
2025-10-30 10:41:31.868226 : Adding Gene Tracks (2 of 7), 0.072 mins elapsed.
2025-10-30 10:41:32.125812 : Plotting, 0.076 mins elapsed.
2025-10-30 10:41:33.399572 : Adding Bulk Tracks (3 of 7), 0.098 mins elapsed.
2025-10-30 10:41:34.412379 : Adding Gene Tracks (3 of 7), 0.115 mins elapsed.
2025-10-30 10:41:34.660011 : Plotting, 0.119 mins elapsed.
2025-10-30 10:41:35.881685 : Adding Bulk Tracks (4 of 7), 0.139 mins elapsed.
2025-10-30 10:41:36.659171 : Adding Gene Tracks (4 of 7), 0.152 mins elapsed.
2025-10-30 10:41:36.9099 : Plotting, 0.156 mins elapsed.
2025-10-30 10:41:38.170023 : Adding Bulk Tracks (5 of 7), 0.177 mins elapsed.
2025-10-30 10:41:39.095898 : Adding Gene Tracks (5 of 7), 0.193 mins elapsed.
2025-10-30 10:41:39.363946 : Plotting, 0.197 mins elapsed.
2025-10-30 10:41:40.773479 : Adding Bulk Tracks (6 of 7), 0.221 mins elapsed.
2025-10-30 10:41:41.702989 : Adding Gene Tracks (6 of 7), 0.236 mins elapsed.
2025-10-30 10:41:41.961645 : Plotting, 0.24 mins elapsed.
2025-10-30 10:41:43.310858 : Adding Bulk Tracks (7 of 7), 0.263 mins elapsed.
2025-10-30 10:41:44.239631 : Adding Gene Tracks (7 of 7), 0.278 mins elapsed.
2025-10-30 10:41:44.504083 : Plotting, 0.283 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a8e0dc389-Date-2025-10-30_Time-10-41-27.542857.log
code cell
In [40]:
grid::grid.newpage()
grid::grid.draw(p$CCND1)
plotPDF(plotList = p,
name = "Plot-Tracks-Marker-Genes.pdf",
ArchRProj = projHeme2,
addDOC = FALSE, width = 5, height = 5)
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
Output:
Plotting Gtable!
Output:
NULL
markdown cell
8. Defining Cluster Identity with scRNA-seq
code cell
In [46]:
#Cross-platform linkage of scATAC-seq cells with scRNA-seq cells
#sc/snRNA-seq data needs Seurat v4 or SingleCellExperiment
seRNA <- readRDS("/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/data/hro_rna_final_anno.rds")
projHeme2 <- addGeneIntegrationMatrix(
ArchRProj = projHeme2,
useMatrix = "GeneScoreMatrix",
matrixName = "GeneIntegrationMatrix",
reducedDims = "IterativeLSI",
seRNA = seRNA,
addToArrow = FALSE,
groupRNA = "Cell_type",
nameCell = "predictedCell_Un",
nameGroup = "predictedGroup_Un",
nameScore = "predictedScore_Un"
)
Output:
ArchR logging to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a81b9703b5-Date-2025-10-31_Time-17-16-14.50102.log
If there is an issue, please report to github with logFile!
2025-10-31 17:16:14.650241 : Running Seurat's Integration Stuart* et al 2019, 0.002 mins elapsed.
2025-10-31 17:16:14.676977 : Checking ATAC Input, 0.003 mins elapsed.
2025-10-31 17:16:14.895387 : Checking RNA Input, 0.007 mins elapsed.
2025-10-31 17:16:20.153665 : Found 21369 overlapping gene names from gene scores and rna matrix!, 0.094 mins elapsed.
2025-10-31 17:16:20.156279 : Creating Integration Blocks, 0.094 mins elapsed.
2025-10-31 17:16:20.453998 : Prepping Interation Data, 0.099 mins elapsed.
subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking`
2025-10-31 17:16:20.998352 : Computing Integration in 1 Integration Blocks!, 0 mins elapsed.
2025-10-31 17:16:21.001557 : Block (1 of 1) : Computing Integration, 0 mins elapsed.
2025-10-31 17:16:24.872289 : Block (1 of 1) : Identifying Variable Genes, 0.065 mins elapsed.
2025-10-31 17:16:27.485666 : Block (1 of 1) : Getting GeneScoreMatrix, 0.108 mins elapsed.
2025-10-31 17:16:37.594323 : Block (1 of 1) : Imputing GeneScoreMatrix, 0.277 mins elapsed.
Getting ImputeWeights
2025-10-31 17:17:17.107258 : Block (1 of 1) : Seurat FindTransferAnchors, 0.935 mins elapsed.
2025-10-31 17:17:34.653529 : Block (1 of 1) : Seurat TransferData Cell Group Labels, 1.228 mins elapsed.
2025-10-31 17:17:36.469198 : Block (1 of 1) : Seurat TransferData Cell Names Labels, 1.258 mins elapsed.
2025-10-31 17:17:49.629022 : Block (1 of 1) : Saving TransferAnchors Joint CCA, 1.477 mins elapsed.
2025-10-31 17:17:51.429326 : Block (1 of 1) : Completed Integration, 1.507 mins elapsed.
2025-10-31 17:17:52.821755 : Block (1 of 1) : Plotting Joint UMAP, 1.53 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:18:30.471151 : Completed Integration with RNA Matrix, 2.158 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a81b9703b5-Date-2025-10-31_Time-17-16-14.50102.log
code cell
In [61]:
cM <- as.matrix(confusionMatrix(projHeme2$Clusters, projHeme2$predictedGroup_Un))
preClust <- colnames(cM)[apply(cM, 1 , which.max)]
cblist <- cbind(preClust, rownames(cM)) %>% as.data.frame() #Assignments
colnames(cblist) <- c('RNA_anno','ATAC_clusters')
cblist
A data.frame: 15 × 2
| RNA_anno | ATAC_clusters |
| <chr> | <chr> |
| BCs | C3 |
| Corneal epithelium | C1 |
| ACs | C12 |
| Rods | C5 |
| Fibroblasts | C10 |
| Fibroblasts | C8 |
| Fibroblasts | C11 |
| Rods | C4 |
| BCs | C6 |
| Microglia | C7 |
| RPE | C14 |
| Fibroblasts | C13 |
| ACs | C15 |
| Cones | C2 |
| RPCs | C9 |
code cell
In [68]:
celltype_groups <- list(
"Photoreceptors" = c("Cones", "Rods", "BCs"), # 光感受器及相关
"Retinal_Neurons" = c("HCs", "ACs", "RGCs"), # 其他视网膜神经元
"Glial_Cells" = c("MGCs", "Microglia", "Microphages"), # 胶质细胞
"Epithelial_Cells" = c("RPE", "Corneal epithelium", "Lens cells"), # 上皮细胞
"Progenitor_Cells" = c("RPCs", "NPCs"), # 前体/祖细胞
"Stromal_Cells" = c("Fibroblasts") # 间质细胞
)
clustPhotoreceptors <- cblist %>% subset(RNA_anno %in% celltype_groups$Photoreceptors) %>% pull(ATAC_clusters)
clustRetinal_Neurons <- cblist %>% subset(RNA_anno %in% celltype_groups$Retinal_Neurons) %>% pull(ATAC_clusters)
clustGlial_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Glial_Cells) %>% pull(ATAC_clusters)
clustEpithelial_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Epithelial_Cells) %>% pull(ATAC_clusters)
clustProgenitor_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Progenitor_Cells) %>% pull(ATAC_clusters)
clustStromal_Cells <- cblist %>% subset(RNA_anno %in% celltype_groups$Stromal_Cells) %>% pull(ATAC_clusters)
rnaPhotoreceptors <- seRNA %>% subset(Cell_type %in% celltype_groups$Photoreceptors) %>% colnames()
rnaRetinal_Neurons <- seRNA %>% subset(Cell_type %in% celltype_groups$Retinal_Neurons) %>% colnames()
rnaGlial_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Glial_Cells) %>% colnames()
rnaEpithelial_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Epithelial_Cells) %>% colnames()
rnaProgenitor_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Progenitor_Cells) %>% colnames()
rnaStromal_Cells <- seRNA %>% subset(Cell_type %in% celltype_groups$Stromal_Cells) %>% colnames()
groupList <- SimpleList(
Photoreceptors = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustPhotoreceptors],
RNA = rnaPhotoreceptors
),
Retinal_Neurons = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustRetinal_Neurons],
RNA = rnaRetinal_Neurons
) ,
Glial_Cells = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustGlial_Cells],
RNA = rnaGlial_Cells
) ,
Epithelial_Cells = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustEpithelial_Cells],
RNA = rnaEpithelial_Cells
) ,
Progenitor_Cells = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustProgenitor_Cells],
RNA = rnaProgenitor_Cells
) ,
Stromal_Cells = SimpleList(
ATAC = projHeme2$cellNames[projHeme2$Clusters %in% clustStromal_Cells],
RNA = rnaStromal_Cells
)
)
code cell
In [69]:
#Adding Pseudo-scRNA-seq profiles for each scATAC-seq cell
projHeme3 <- addGeneIntegrationMatrix(
ArchRProj = projHeme2,
useMatrix = "GeneScoreMatrix",
matrixName = "GeneIntegrationMatrix",
reducedDims = "IterativeLSI",
seRNA = seRNA,
addToArrow = TRUE,
force= TRUE,
groupList = groupList,
groupRNA = "Cell_type",
nameCell = "predictedCell",
nameGroup = "predictedGroup",
nameScore = "predictedScore"
)
Output:
ArchR logging to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a864e5a7da-Date-2025-10-31_Time-17-52-05.274252.log
If there is an issue, please report to github with logFile!
2025-10-31 17:52:05.438426 : Running Seurat's Integration Stuart* et al 2019, 0.003 mins elapsed.
2025-10-31 17:52:05.460565 : Checking ATAC Input, 0.003 mins elapsed.
2025-10-31 17:52:05.719466 : Checking RNA Input, 0.007 mins elapsed.
2025-10-31 17:52:10.959871 : Found 21369 overlapping gene names from gene scores and rna matrix!, 0.095 mins elapsed.
2025-10-31 17:52:10.962484 : Creating Integration Blocks, 0.095 mins elapsed.
2025-10-31 17:52:11.163365 : Prepping Interation Data, 0.098 mins elapsed.
subThreading Disabled since ArchRLocking is TRUE see `addArchRLocking`
2025-10-31 17:52:11.699249 : Computing Integration in 6 Integration Blocks!, 0 mins elapsed.
2025-10-31 17:52:11.706991 : Block (1 of 6) : Computing Integration, 0 mins elapsed.
2025-10-31 17:52:15.239796 : Block (1 of 6) : Identifying Variable Genes, 0.059 mins elapsed.
2025-10-31 17:52:17.644931 : Block (1 of 6) : Getting GeneScoreMatrix, 0.099 mins elapsed.
2025-10-31 17:52:23.444779 : Block (1 of 6) : Imputing GeneScoreMatrix, 0.196 mins elapsed.
Getting ImputeWeights
2025-10-31 17:52:25.743501 : Block (1 of 6) : Seurat FindTransferAnchors, 0.234 mins elapsed.
2025-10-31 17:52:36.354176 : Block (1 of 6) : Seurat TransferData Cell Group Labels, 0.411 mins elapsed.
2025-10-31 17:52:36.791763 : Block (1 of 6) : Seurat TransferData Cell Names Labels, 0.418 mins elapsed.
2025-10-31 17:52:39.999354 : Block (1 of 6) : Seurat TransferData GeneMatrix, 0.472 mins elapsed.
2025-10-31 17:52:42.303766 : Block (1 of 6) : Saving TransferAnchors Joint CCA, 0.51 mins elapsed.
2025-10-31 17:52:43.409859 : Block (1 of 6) : Transferring Paired RNA to Temp File, 0.529 mins elapsed.
2025-10-31 17:52:44.873069 : Block (1 of 6) : Completed Integration, 0.553 mins elapsed.
2025-10-31 17:52:45.872324 : Block (2 of 6) : Computing Integration, 0.57 mins elapsed.
2025-10-31 17:52:49.095483 : Block (2 of 6) : Identifying Variable Genes, 0.623 mins elapsed.
2025-10-31 17:52:51.537902 : Block (2 of 6) : Getting GeneScoreMatrix, 0.664 mins elapsed.
2025-10-31 17:52:56.345476 : Block (2 of 6) : Imputing GeneScoreMatrix, 0.744 mins elapsed.
Getting ImputeWeights
2025-10-31 17:52:57.888306 : Block (2 of 6) : Seurat FindTransferAnchors, 0.77 mins elapsed.
2025-10-31 17:53:07.39262 : Block (2 of 6) : Seurat TransferData Cell Group Labels, 0.928 mins elapsed.
2025-10-31 17:53:07.667436 : Block (2 of 6) : Seurat TransferData Cell Names Labels, 0.933 mins elapsed.
2025-10-31 17:53:09.737132 : Block (2 of 6) : Seurat TransferData GeneMatrix, 0.967 mins elapsed.
2025-10-31 17:53:11.580403 : Block (2 of 6) : Saving TransferAnchors Joint CCA, 0.998 mins elapsed.
2025-10-31 17:53:12.67559 : Block (2 of 6) : Transferring Paired RNA to Temp File, 1.016 mins elapsed.
2025-10-31 17:53:13.669618 : Block (2 of 6) : Completed Integration, 1.033 mins elapsed.
2025-10-31 17:53:14.562255 : Block (3 of 6) : Computing Integration, 1.048 mins elapsed.
2025-10-31 17:53:16.014731 : Block (3 of 6) : Identifying Variable Genes, 1.072 mins elapsed.
2025-10-31 17:53:16.996151 : Block (3 of 6) : Getting GeneScoreMatrix, 1.088 mins elapsed.
2025-10-31 17:53:21.556036 : Block (3 of 6) : Imputing GeneScoreMatrix, 1.164 mins elapsed.
Getting ImputeWeights
2025-10-31 17:53:22.715645 : Block (3 of 6) : Seurat FindTransferAnchors, 1.184 mins elapsed.
2025-10-31 17:53:25.507706 : Block (3 of 6) : Seurat TransferData Cell Group Labels, 1.23 mins elapsed.
2025-10-31 17:53:25.609681 : Block (3 of 6) : Seurat TransferData Cell Names Labels, 1.232 mins elapsed.
2025-10-31 17:53:25.739693 : Block (3 of 6) : Seurat TransferData GeneMatrix, 1.234 mins elapsed.
2025-10-31 17:53:26.971188 : Block (3 of 6) : Saving TransferAnchors Joint CCA, 1.255 mins elapsed.
2025-10-31 17:53:27.904337 : Block (3 of 6) : Transferring Paired RNA to Temp File, 1.27 mins elapsed.
2025-10-31 17:53:28.311682 : Block (3 of 6) : Completed Integration, 1.277 mins elapsed.
2025-10-31 17:53:29.211357 : Block (4 of 6) : Computing Integration, 1.292 mins elapsed.
2025-10-31 17:53:30.948436 : Block (4 of 6) : Identifying Variable Genes, 1.321 mins elapsed.
2025-10-31 17:53:32.144675 : Block (4 of 6) : Getting GeneScoreMatrix, 1.341 mins elapsed.
2025-10-31 17:53:37.116117 : Block (4 of 6) : Imputing GeneScoreMatrix, 1.424 mins elapsed.
Getting ImputeWeights
2025-10-31 17:53:38.510089 : Block (4 of 6) : Seurat FindTransferAnchors, 1.447 mins elapsed.
2025-10-31 17:53:43.357619 : Block (4 of 6) : Seurat TransferData Cell Group Labels, 1.528 mins elapsed.
2025-10-31 17:53:43.548175 : Block (4 of 6) : Seurat TransferData Cell Names Labels, 1.531 mins elapsed.
2025-10-31 17:53:43.915248 : Block (4 of 6) : Seurat TransferData GeneMatrix, 1.537 mins elapsed.
2025-10-31 17:53:45.41506 : Block (4 of 6) : Saving TransferAnchors Joint CCA, 1.562 mins elapsed.
2025-10-31 17:53:46.37791 : Block (4 of 6) : Transferring Paired RNA to Temp File, 1.578 mins elapsed.
2025-10-31 17:53:47.148634 : Block (4 of 6) : Completed Integration, 1.591 mins elapsed.
2025-10-31 17:53:48.042885 : Block (5 of 6) : Computing Integration, 1.606 mins elapsed.
2025-10-31 17:53:51.579289 : Block (5 of 6) : Identifying Variable Genes, 1.665 mins elapsed.
2025-10-31 17:53:54.128001 : Block (5 of 6) : Getting GeneScoreMatrix, 1.707 mins elapsed.
2025-10-31 17:53:58.865273 : Block (5 of 6) : Imputing GeneScoreMatrix, 1.786 mins elapsed.
Getting ImputeWeights
2025-10-31 17:53:59.941153 : Block (5 of 6) : Seurat FindTransferAnchors, 1.804 mins elapsed.
2025-10-31 17:54:08.924143 : Block (5 of 6) : Seurat TransferData Cell Group Labels, 1.954 mins elapsed.
2025-10-31 17:54:08.996598 : Block (5 of 6) : Seurat TransferData Cell Names Labels, 1.955 mins elapsed.
2025-10-31 17:54:10.218672 : Block (5 of 6) : Seurat TransferData GeneMatrix, 1.975 mins elapsed.
2025-10-31 17:54:11.452084 : Block (5 of 6) : Saving TransferAnchors Joint CCA, 1.996 mins elapsed.
2025-10-31 17:54:12.566777 : Block (5 of 6) : Transferring Paired RNA to Temp File, 2.014 mins elapsed.
2025-10-31 17:54:12.859383 : Block (5 of 6) : Completed Integration, 2.019 mins elapsed.
2025-10-31 17:54:13.758763 : Block (6 of 6) : Computing Integration, 2.034 mins elapsed.
2025-10-31 17:54:15.916436 : Block (6 of 6) : Identifying Variable Genes, 2.07 mins elapsed.
2025-10-31 17:54:17.411086 : Block (6 of 6) : Getting GeneScoreMatrix, 2.095 mins elapsed.
2025-10-31 17:54:24.692338 : Block (6 of 6) : Imputing GeneScoreMatrix, 2.217 mins elapsed.
Getting ImputeWeights
2025-10-31 17:54:30.731245 : Block (6 of 6) : Seurat FindTransferAnchors, 2.317 mins elapsed.
2025-10-31 17:54:38.558136 : Block (6 of 6) : Seurat TransferData Cell Group Labels, 2.448 mins elapsed.
2025-10-31 17:54:39.653628 : Block (6 of 6) : Seurat TransferData Cell Names Labels, 2.466 mins elapsed.
2025-10-31 17:54:42.022363 : Block (6 of 6) : Seurat TransferData GeneMatrix, 2.505 mins elapsed.
2025-10-31 17:54:45.889776 : Block (6 of 6) : Saving TransferAnchors Joint CCA, 2.57 mins elapsed.
2025-10-31 17:54:46.986678 : Block (6 of 6) : Transferring Paired RNA to Temp File, 2.588 mins elapsed.
2025-10-31 17:54:50.423219 : Block (6 of 6) : Completed Integration, 2.645 mins elapsed.
2025-10-31 17:54:51.406242 : Block (1 of 6) : Plotting Joint UMAP, 2.662 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:55:18.317399 : Block (2 of 6) : Plotting Joint UMAP, 3.11 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:55:42.952962 : Block (3 of 6) : Plotting Joint UMAP, 3.521 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:55:51.823908 : Block (4 of 6) : Plotting Joint UMAP, 3.669 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:56:07.015005 : Block (5 of 6) : Plotting Joint UMAP, 3.922 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:56:31.598351 : Block (6 of 6) : Plotting Joint UMAP, 4.332 mins elapsed.
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
Also defined by ‘spam’
2025-10-31 17:57:03.055419 : Transferring Data to ArrowFiles, 4.856 mins elapsed.
2025-10-31 17:57:03.0737 : fetal_eye_except_retina (1 of 2) Getting GeneIntegrationMatrix From TempFiles!, 4.856 mins elapsed.
2025-10-31 17:57:10.514187 : fetal_eye_except_retina (1 of 2) Adding GeneIntegrationMatrix to ArrowFile!, 4.98 mins elapsed.
2025-10-31 17:58:04.15675 : fetal_retina (2 of 2) Getting GeneIntegrationMatrix From TempFiles!, 5.874 mins elapsed.
2025-10-31 17:58:05.806291 : fetal_retina (2 of 2) Adding GeneIntegrationMatrix to ArrowFile!, 5.902 mins elapsed.
2025-10-31 17:58:39.959272 : Completed Integration with RNA Matrix, 6.471 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addGeneIntegrationMatrix-91a864e5a7da-Date-2025-10-31_Time-17-52-05.274252.log
code cell
In [70]:
p13 <- plotEmbedding(
ArchRProj = projHeme3,
colorBy = "GeneIntegrationMatrix",
name = markerGenes,
continuousSet = "horizonExtra",
embedding = "UMAP",
imputeWeights = getImputeWeights(projHeme3)
)
p14 <- plotEmbedding(
ArchRProj = projHeme3,
colorBy = "GeneScoreMatrix",
continuousSet = "horizonExtra",
name = markerGenes,
embedding = "UMAP",
imputeWeights = getImputeWeights(projHeme3)
)
Output:
Getting ImputeWeights
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a8dd421cb-Date-2025-10-31_Time-17-58-40.089852.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = GeneIntegrationMatrix
Getting Matrix Values...
2025-10-31 17:58:40.651754 :
Imputing Matrix
Using weights on disk
1 of 1
Using weights on disk
1 of 1
Plotting Embedding
1
2
3
4
5
6
7
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a8dd421cb-Date-2025-10-31_Time-17-58-40.089852.log
Getting ImputeWeights
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a847797697-Date-2025-10-31_Time-17-58-48.481978.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = GeneScoreMatrix
Getting Matrix Values...
2025-10-31 17:58:49.094289 :
Imputing Matrix
Using weights on disk
1 of 1
Using weights on disk
1 of 1
Plotting Embedding
1
2
3
4
5
6
7
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a847797697-Date-2025-10-31_Time-17-58-48.481978.log
code cell
In [71]:
#Labeling scATAC-seq clusters with scRNA-seq information
cM <- confusionMatrix(projHeme3$Clusters, projHeme3$predictedGroup)
labelOld <- rownames(cM)
labelOld
- 'C3'
- 'C1'
- 'C12'
- 'C5'
- 'C10'
- 'C8'
- 'C11'
- 'C4'
- 'C6'
- 'C7'
- 'C14'
- 'C13'
- 'C15'
- 'C2'
- 'C9'
code cell
In [72]:
labelNew <- colnames(cM)[apply(cM, 1, which.max)]
labelNew
- 'Rods'
- 'RPE'
- 'ACs'
- 'Rods'
- 'Fibroblasts'
- 'Fibroblasts'
- 'Fibroblasts'
- 'Rods'
- 'BCs'
- 'Microphages'
- 'RPE'
- 'Fibroblasts'
- 'ACs'
- 'Cones'
- 'RPCs'
code cell
In [73]:
projHeme3$Clusters2 <- mapLabels(projHeme3$Clusters, newLabels = labelNew, oldLabels = labelOld)
code cell
In [74]:
p15 <- plotEmbedding(projHeme3, colorBy = "cellColData", name = "Clusters2")
p15
Output:
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a848c0abc6-Date-2025-10-31_Time-17-59-46.285521.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = cellColData
Plotting Embedding
1
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a848c0abc6-Date-2025-10-31_Time-17-59-46.285521.log
code cell
In [75]:
projHeme3 <- saveArchRProject(ArchRProj = projHeme3, outputDirectory = "Save-ProjHeme3", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme3
Copying Arrow Files...
Copying Arrow Files (1 of 2)
Copying Arrow Files (2 of 2)
Getting ImputeWeights
Dropping ImputeWeights...
Copying Other Files...
Copying Other Files (1 of 4): Embeddings
Copying Other Files (2 of 4): IterativeLSI
Copying Other Files (3 of 4): Plots
Copying Other Files (4 of 4): RNAIntegration
Saving ArchRProject...
Loading ArchRProject...
Successfully loaded ArchRProject!
/ |
/ \
. / |.
\\\ / |.
\\\ / `|.
\\\ / |.
\ / |\
\\#####\ / ||
==###########> / ||
\\##==......\ / ||
______ = =|__ /__ || \\\
,--' ,----`-,__ ___/' --,-`-===================##========>
\ ' ##_______ _____ ,--,__,=##,__ ///
, __== ___,-,__,--'#' ===' `-' | ##,-/
-,____,---' \\####\\________________,--\\_##,/
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
rm(projHeme2)
gc()
markdown cell
9. Making Pseudo-bulk Replicates
code cell
In [77]:
library(BSgenome.Hsapiens.UCSC.hg38)
projHeme4 <- addGroupCoverages(ArchRProj = projHeme3, groupBy = "Clusters2")
Output:
Loading required package: BSgenome
Loading required package: BiocIO
Loading required package: rtracklayer
Warning message:
“package ‘rtracklayer’ was built under R version 4.3.3”
Attaching package: ‘rtracklayer’
The following object is masked from ‘package:BiocIO’:
FileForFormat
ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-91a856f6d087-Date-2025-10-31_Time-18-11-20.553077.log
If there is an issue, please report to github with logFile!
ACs (1 of 8) : CellGroups N = 2
BCs (2 of 8) : CellGroups N = 2
Cones (3 of 8) : CellGroups N = 2
Fibroblasts (4 of 8) : CellGroups N = 2
Microphages (5 of 8) : CellGroups N = 2
Rods (6 of 8) : CellGroups N = 2
RPCs (7 of 8) : CellGroups N = 2
RPE (8 of 8) : CellGroups N = 2
2025-10-31 18:11:24.733634 : Creating Coverage Files!, 0.07 mins elapsed.
2025-10-31 18:11:24.736996 : Batch Execution w/ safelapply!, 0.07 mins elapsed.
2025-10-31 18:11:24.741147 : Group ACs._.fetal_eye_except_retina (1 of 16) : Creating Group Coverage File : ACs._.fetal_eye_except_retina.insertions.coverage.h5, 0.07 mins elapsed.
Number of Cells = 500
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:11:54.465145 : Group ACs._.fetal_retina (2 of 16) : Creating Group Coverage File : ACs._.fetal_retina.insertions.coverage.h5, 0.565 mins elapsed.
Number of Cells = 179
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:12:28.908771 : Group BCs._.fetal_retina (3 of 16) : Creating Group Coverage File : BCs._.fetal_retina.insertions.coverage.h5, 1.139 mins elapsed.
Number of Cells = 136
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:12:57.681518 : Group BCs._.fetal_eye_except_retina (4 of 16) : Creating Group Coverage File : BCs._.fetal_eye_except_retina.insertions.coverage.h5, 1.619 mins elapsed.
Number of Cells = 117
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:13:25.117063 : Group Cones._.Rep1 (5 of 16) : Creating Group Coverage File : Cones._.Rep1.insertions.coverage.h5, 2.076 mins elapsed.
Number of Cells = 40
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:13:55.259366 : Group Cones._.Rep2 (6 of 16) : Creating Group Coverage File : Cones._.Rep2.insertions.coverage.h5, 2.578 mins elapsed.
Number of Cells = 40
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:14:25.564667 : Group Fibroblasts._.fetal_eye_except_retina (7 of 16) : Creating Group Coverage File : Fibroblasts._.fetal_eye_except_retina.insertions.coverage.h5, 3.084 mins elapsed.
Number of Cells = 500
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:14:56.844514 : Group Fibroblasts._.fetal_retina (8 of 16) : Creating Group Coverage File : Fibroblasts._.fetal_retina.insertions.coverage.h5, 3.605 mins elapsed.
Number of Cells = 70
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:15:23.53514 : Group Microphages._.Rep1 (9 of 16) : Creating Group Coverage File : Microphages._.Rep1.insertions.coverage.h5, 4.05 mins elapsed.
Number of Cells = 266
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:15:51.290942 : Group Microphages._.Rep2 (10 of 16) : Creating Group Coverage File : Microphages._.Rep2.insertions.coverage.h5, 4.512 mins elapsed.
Number of Cells = 40
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:16:20.576663 : Group Rods._.fetal_retina (11 of 16) : Creating Group Coverage File : Rods._.fetal_retina.insertions.coverage.h5, 5 mins elapsed.
Number of Cells = 500
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:16:54.15746 : Group Rods._.fetal_eye_except_retina (12 of 16) : Creating Group Coverage File : Rods._.fetal_eye_except_retina.insertions.coverage.h5, 5.56 mins elapsed.
Number of Cells = 435
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:17:24.060377 : Group RPCs._.Rep1 (13 of 16) : Creating Group Coverage File : RPCs._.Rep1.insertions.coverage.h5, 6.058 mins elapsed.
Number of Cells = 134
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:17:50.501192 : Group RPCs._.Rep2 (14 of 16) : Creating Group Coverage File : RPCs._.Rep2.insertions.coverage.h5, 6.499 mins elapsed.
Number of Cells = 40
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:18:19.122296 : Group RPE._.fetal_eye_except_retina (15 of 16) : Creating Group Coverage File : RPE._.fetal_eye_except_retina.insertions.coverage.h5, 6.976 mins elapsed.
Number of Cells = 500
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:18:48.850879 : Group RPE._.fetal_retina (16 of 16) : Creating Group Coverage File : RPE._.fetal_retina.insertions.coverage.h5, 7.472 mins elapsed.
Number of Cells = 69
Coverage File Exists!
Added Coverage Group
Added Metadata Group
Added ArrowCoverage Class
Added Coverage/Info
Added Coverage/Info/CellNames
2025-10-31 18:19:14.971433 : Adding Kmer Bias to Coverage Files!, 7.907 mins elapsed.
Output:
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
Output:
Completed Kmer Bias Calculation
Adding Kmer Bias (1 of 16)
Adding Kmer Bias (2 of 16)
Adding Kmer Bias (3 of 16)
Adding Kmer Bias (4 of 16)
Adding Kmer Bias (5 of 16)
Adding Kmer Bias (6 of 16)
Adding Kmer Bias (7 of 16)
Adding Kmer Bias (8 of 16)
Adding Kmer Bias (9 of 16)
Adding Kmer Bias (10 of 16)
Adding Kmer Bias (11 of 16)
Adding Kmer Bias (12 of 16)
Adding Kmer Bias (13 of 16)
Adding Kmer Bias (14 of 16)
Adding Kmer Bias (15 of 16)
Adding Kmer Bias (16 of 16)
2025-10-31 18:20:10.540152 : Finished Creation of Coverage Files!, 8.833 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addGroupCoverages-91a856f6d087-Date-2025-10-31_Time-18-11-20.553077.log
code cell
In [78]:
projHeme4@projectMetadata$GroupCoverages$Clusters2$coverageMetadata
Result:
DataFrame with 16 rows and 5 columns
Group Name File nCells
<character> <character> <character> <integer>
1 ACs ACs._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 500
2 ACs ACs._.fetal_retina /share2/pub/zhouyj/z.. 179
3 BCs BCs._.fetal_retina /share2/pub/zhouyj/z.. 136
4 BCs BCs._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 117
5 Cones Cones._.Rep1 /share2/pub/zhouyj/z.. 40
... ... ... ... ...
12 Rods Rods._.fetal_eye_exc.. /share2/pub/zhouyj/z.. 435
13 RPCs RPCs._.Rep1 /share2/pub/zhouyj/z.. 134
14 RPCs RPCs._.Rep2 /share2/pub/zhouyj/z.. 40
15 RPE RPE._.fetal_eye_exce.. /share2/pub/zhouyj/z.. 500
16 RPE RPE._.fetal_retina /share2/pub/zhouyj/z.. 69
nInsertions
<numeric>
1 4440076
2 2864548
3 4201458
4 1642106
5 2234234
... ...
12 10706980
13 825818
14 200906
15 10886846
16 2480500
code cell
In [79]:
groups <- addGroupCoverages(ArchRProj = projHeme3, groupBy = "Clusters2", returnGroups = TRUE)
Output:
ArchR logging to : ArchRLogs/ArchR-addGroupCoverages-91a848f6d48c-Date-2025-10-31_Time-18-49-00.131534.log
If there is an issue, please report to github with logFile!
ACs (1 of 8) : CellGroups N = 2
BCs (2 of 8) : CellGroups N = 2
Cones (3 of 8) : CellGroups N = 2
Fibroblasts (4 of 8) : CellGroups N = 2
Microphages (5 of 8) : CellGroups N = 2
Rods (6 of 8) : CellGroups N = 2
RPCs (7 of 8) : CellGroups N = 2
RPE (8 of 8) : CellGroups N = 2
markdown cell
10. Calling Peaks
code cell
In [81]:
pathToMacs2 <- findMacs2()
pathToMacs2
Output:
Searching For MACS2..
Found with $PATH at /share/pub/zhouyj/anaconda3/envs/scRNA/bin/macs2
'macs2'
code cell
In [82]:
projHeme4 <- addReproduciblePeakSet(
ArchRProj = projHeme4,
groupBy = "Clusters2",
pathToMacs2 = pathToMacs2
)
Output:
ArchR logging to : ArchRLogs/ArchR-addReproduciblePeakSet-91a81db437b6-Date-2025-10-31_Time-19-02-54.509429.log
If there is an issue, please report to github with logFile!
Calling Peaks with Macs2
2025-10-31 19:02:56.815876 : Peak Calling Parameters!, 0.038 mins elapsed.
Output:
Group nCells nCellsUsed nReplicates nMin nMax maxPeaks
ACs ACs 861 679 2 179 500 150000
BCs BCs 253 253 2 117 136 126500
Cones Cones 69 62 2 40 40 31000
Fibroblasts Fibroblasts 2859 570 2 70 500 150000
Microphages Microphages 306 306 2 40 266 150000
Rods Rods 1041 935 2 435 500 150000
RPCs RPCs 174 174 2 40 134 87000
RPE RPE 622 569 2 69 500 150000
Output:
2025-10-31 19:02:56.825329 : Batching Peak Calls!, 0.039 mins elapsed.
2025-10-31 19:02:56.83453 : Batch Execution w/ safelapply!, 0 mins elapsed.
2025-10-31 19:04:29.351633 : Identifying Reproducible Peaks!, 1.581 mins elapsed.
Output:
ted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
R_zmq_msg_send errno: 4 strerror: Interrupted system call
Output:
2025-10-31 19:11:29.996123 : Creating Union Peak Set!, 8.591 mins elapsed.
Converged after 6 iterations!
Plotting Ggplot!
2025-10-31 19:22:42.27675 : Finished Creating Union Peak Set (129092)!, 19.796 mins elapsed.
code cell
In [83]:
list.files(path=paste0(getOutputDirectory(ArchRProj = projHeme4),"/PeakCalls"))
code cell
In [84]:
getPeakSet(projHeme4)
Result:
GRanges object with 129092 ranges and 13 metadata columns:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <numeric>
BCs chr1 794305-794805 * | 6.39722
Fibroblasts chr1 826534-827034 * | 13.74890
Rods chr1 827309-827809 * | 157.73900
RPCs chr1 850196-850696 * | 3.37943
Rods chr1 869681-870181 * | 257.91700
... ... ... ... . ...
Microphages chrX 155632362-155632862 * | 6.60554
Rods chrX 155767346-155767846 * | 33.39460
Rods chrX 155820060-155820560 * | 28.29440
BCs chrX 155880528-155881028 * | 5.89683
Cones chrX 155881055-155881555 * | 24.78860
replicateScoreQuantile groupScoreQuantile Reproducibility
<numeric> <numeric> <numeric>
BCs 0.645 0.303 2
Fibroblasts 0.569 0.199 2
Rods 0.907 0.856 2
RPCs 0.586 0.193 2
Rods 0.897 0.841 2
... ... ... ...
Microphages 0.618 0.195 2
Rods 0.743 0.602 2
Rods 0.606 0.392 2
BCs 0.358 0.081 2
Cones 0.915 0.859 2
GroupReplicate distToGeneStart nearestGene peakType
<character> <integer> <character> <character>
BCs BCs._.fetal_eye_exce.. 22815 FAM87B Distal
Fibroblasts Fibroblasts._.fetal_.. 737 LINC00115 Exonic
Rods Rods._.fetal_eye_exc.. 36 LINC00115 Promoter
RPCs RPCs._.Rep2 22923 LINC00115 Intronic
Rods Rods._.fetal_retina 6971 FAM41C Intronic
... ... ... ... ...
Microphages Microphages._.Rep1 37331 TMLHE Intronic
Rods Rods._.fetal_eye_exc.. 97651 TMLHE Distal
Rods Rods._.fetal_retina 150365 TMLHE Distal
BCs BCs._.fetal_retina 210833 TMLHE Distal
Cones Cones._.Rep2 211360 TMLHE Distal
distToTSS nearestTSS GC idx N
<integer> <character> <numeric> <integer> <numeric>
BCs 8754 uc057awx.1 0.4810 1 0
Fibroblasts 47 uc057axb.1 0.5150 2 0
Rods 36 uc010nxx.3 0.6966 3 0
RPCs 901 uc057axk.1 0.4731 4 0
Rods 269 uc057axl.1 0.7405 5 0
... ... ... ... ... ...
Microphages 19675 uc004fnn.5 0.4271 2466 0
Rods 215 uc004fnq.2 0.5409 2467 0
Rods 52497 uc004fnq.2 0.4671 2468 0
BCs 514 uc004fnr.4 0.4212 2469 0
Cones 11 uc004fnr.4 0.6287 2470 0
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [85]:
projHeme4 <- saveArchRProject(ArchRProj = projHeme4, outputDirectory = "Save-ProjHeme4", load = TRUE)
Output:
Copying ArchRProject to new outputDirectory : /share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme4
Copying Arrow Files...
Copying Arrow Files (1 of 2)
Copying Arrow Files (2 of 2)
Getting ImputeWeights
No imputeWeights found, returning NULL
Copying Other Files...
Copying Other Files (1 of 6): Embeddings
Copying Other Files (2 of 6): GroupCoverages
Copying Other Files (3 of 6): IterativeLSI
Copying Other Files (4 of 6): PeakCalls
Copying Other Files (5 of 6): Plots
Copying Other Files (6 of 6): RNAIntegration
Saving ArchRProject...
Loading ArchRProject...
Successfully loaded ArchRProject!
/ |
/ \
. / |.
\\\ / |.
\\\ / `|.
\\\ / |.
\ / |\
\\#####\ / ||
==###########> / ||
\\##==......\ / ||
______ = =|__ /__ || \\\
,--' ,----`-,__ ___/' --,-`-===================##========>
\ ' ##_______ _____ ,--,__,=##,__ ///
, __== ___,-,__,--'#' ===' `-' | ##,-/
-,____,---' \\####\\________________,--\\_##,/
___ .______ ______ __ __ .______
/ \ | _ \ / || | | | | _ \
/ ^ \ | |_) | | ,----'| |__| | | |_) |
/ /_\ \ | / | | | __ | | /
/ _____ \ | |\ \\___ | `----.| | | | | |\ \\___.
/__/ \__\ | _| `._____| \______||__| |__| | _| `._____|
code cell
In [86]:
projHeme5 <- addPeakMatrix(projHeme4)
getOutputDirectory(ArchRProj = projHeme5)
getAvailableMatrices(projHeme5)
Output:
ArchR logging to : ArchRLogs/ArchR-addPeakMatrix-91a8539812a2-Date-2025-10-31_Time-19-22-50.804515.log
If there is an issue, please report to github with logFile!
2025-10-31 19:22:55.422745 : Batch Execution w/ safelapply!, 0 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addPeakMatrix-91a8539812a2-Date-2025-10-31_Time-19-22-50.804515.log
'/share2/pub/zhouyj/zhouyj/Human_eye_atlas/analysis/demo/Save-ProjHeme4'
- 'GeneIntegrationMatrix'
- 'GeneScoreMatrix'
- 'PeakMatrix'
- 'TileMatrix'
code cell
In [87]:
rm(projHeme4)
gc()
A matrix: 2 × 6 of type dbl
| used | (Mb) | gc trigger | (Mb) | max used | (Mb) |
| Ncells | 12339869 | 659.1 | 21390551 | 1142.4 | 21390551 | 1142.4 |
| Vcells | 646043407 | 4929.0 | 1848418318 | 14102.4 | 1777880515 | 13564.2 |
markdown cell
11. Identifying Marker Peaks
code cell
In [88]:
#Identify
#Between groups by set useGroups and bgdGroups
markerPeaks <- getMarkerFeatures(
ArchRProj = projHeme5,
useMatrix = "PeakMatrix",
groupBy = "Clusters2",
bias = c("TSSEnrichment", "log10(nFrags)"),
testMethod = "wilcoxon"
)
Output:
ArchR logging to : ArchRLogs/ArchR-getMarkerFeatures-91a823dd57ca-Date-2025-10-31_Time-19-24-19.361271.log
If there is an issue, please report to github with logFile!
MatrixClass = Sparse.Integer.Matrix
2025-10-31 19:24:19.984723 : Matching Known Biases, 0.004 mins elapsed.
###########
2025-10-31 19:25:00.011829 : Completed Pairwise Tests, 0.671 mins elapsed.
###########
ArchR logging successful to : ArchRLogs/ArchR-getMarkerFeatures-91a823dd57ca-Date-2025-10-31_Time-19-24-19.361271.log
code cell
In [89]:
markerPeaks
Result:
class: SummarizedExperiment
dim: 129092 8
metadata(2): MatchInfo Params
assays(7): Log2FC Mean ... AUC MeanBGD
rownames(129092): 1 2 ... 129091 129092
rowData names(4): seqnames idx start end
colnames(8): ACs BCs ... RPCs RPE
colData names(0):
code cell
In [109]:
markerList <- getMarkers(markerPeaks, cutOff = "FDR <= 0.05 & Log2FC >= 1")
head(markerList$ACs)
Result:
DataFrame with 6 rows and 7 columns
seqnames idx start end Log2FC FDR MeanDiff
<Rle> <integer> <integer> <integer> <numeric> <numeric> <numeric>
2757 chr1 2757 26542185 26542685 1.19779 0.00135953 0.238590
81998 chr22 2747 46674339 46674839 6.88919 0.00135953 0.236970
12152 chr1 12152 236065000 236065500 1.61085 0.00161278 0.477419
56006 chr18 104 3177818 3178318 7.45332 0.00181553 0.351322
60755 chr19 2136 16258413 16258913 6.69128 0.00247413 0.206335
105726 chr6 3975 89035754 89036254 1.05297 0.00339361 0.201822
code cell
In [91]:
#Plotting marker peaks
heatmapPeaks <- markerHeatmap(
seMarker = markerPeaks,
cutOff = "FDR <= 0.1 & Log2FC >= 0.5",
transpose = TRUE
)
Output:
Warning message:
“'markerHeatmap' is deprecated.
Use 'plotMarkerHeatmap' instead.
See help("Deprecated")”
ArchR logging to : ArchRLogs/ArchR-plotMarkerHeatmap-91a87adddb1c-Date-2025-10-31_Time-19-25-03.501813.log
If there is an issue, please report to github with logFile!
Identified 53861 markers!
Output:
[1] "chr1:15345799-15346299" "chr1:22652905-22653405"
[3] "chr1:22662317-22662817" "chr1:25247315-25247815"
[5] "chr1:26542185-26542685" "chr1:30756609-30757109"
[7] "chr1:31789140-31789640" "chr1:32981000-32981500"
[9] "chr1:40385615-40386115" "chr1:40653829-40654329"
[11] "chr1:44031289-44031789" "chr1:50251731-50252231"
[13] "chr1:59292570-59293070" "chr1:81800172-81800672"
[15] "chr1:85201486-85201986" "chr1:1426870-1427370"
[17] "chr1:3529302-3529802" "chr1:4760380-4760880"
[19] "chr1:5287664-5288164" "chr1:6145100-6145600"
[21] "chr1:7052660-7053160" "chr1:7427476-7427976"
[23] "chr1:9883898-9884398" "chr1:18134656-18135156"
[25] "chr1:18142896-18143396" "chr1:18182749-18183249"
[27] "chr1:18190460-18190960" "chr1:19028006-19028506"
[29] "chr1:19640442-19640942" "chr1:19730618-19731118"
[31] "chr12:32877591-32878091" "chr5:168291370-168291870"
[33] "chr1:39562818-39563318" "chr1:180188651-180189151"
[35] "chr1:201941573-201942073" "chr1:234779538-234780038"
[37] "chr10:118965616-118966116" "chr12:3139801-3140301"
[39] "chr12:109715428-109715928" "chr14:66649464-66649964"
[41] "chr14:92886342-92886842" "chr14:99540242-99540742"
[43] "chr15:34754777-34755277" "chr16:20866167-20866667"
[45] "chr16:85431694-85432194" "chr1:916454-916954"
[47] "chr1:940711-941211" "chr1:1013184-1013684"
[49] "chr1:1014283-1014783" "chr1:1019878-1020378"
[51] "chr1:1349671-1350171" "chr1:1352586-1353086"
[53] "chr1:1356999-1357499" "chr1:1357783-1358283"
[55] "chr1:1358361-1358861" "chr1:1359159-1359659"
[57] "chr1:1359675-1360175" "chr1:1360521-1361021"
[59] "chr1:1361929-1362429" "chr1:1435436-1435936"
[61] "chr1:1433751-1434251" "chr1:1713451-1713951"
[63] "chr1:1858593-1859093" "chr1:2148658-2149158"
[65] "chr1:2255410-2255910" "chr1:2256059-2256559"
[67] "chr1:2548636-2549136" "chr1:2578271-2578771"
[69] "chr1:5698062-5698562" "chr1:6935302-6935802"
[71] "chr1:7491405-7491905" "chr1:7543022-7543522"
[73] "chr1:7648968-7649468" "chr1:7760222-7760722"
[75] "chr1:9239591-9240091" "chr1:869681-870181"
[77] "chr1:909902-910402" "chr1:911146-911646"
[79] "chr1:911996-912496" "chr1:932831-933331"
[81] "chr1:935223-935723" "chr1:935772-936272"
[83] "chr1:938022-938522" "chr1:939048-939548"
[85] "chr1:939567-940067" "chr1:941591-942091"
[87] "chr1:942299-942799" "chr1:945174-945674"
[89] "chr1:947850-948350" "chr1:951304-951804"
[91] "chr1:924788-925288" "chr1:925411-925911"
[93] "chr1:933396-933896" "chr1:991895-992395"
[95] "chr1:1025031-1025531" "chr1:1059407-1059907"
[97] "chr1:1079460-1079960" "chr1:1158106-1158606"
[99] "chr1:1422879-1423379" "chr1:1433030-1433530"
[101] "chr1:1599206-1599706" "chr1:1891492-1891992"
[103] "chr1:2025415-2025915" "chr1:2248828-2249328"
[105] "chr1:2257264-2257764" "chr1:2481980-2482480"
[107] "chr1:2527481-2527981"
Output:
Adding Annotations..
Preparing Main Heatmap..
ArchR logging successful to : ArchRLogs/ArchR-plotMarkerHeatmap-91a87adddb1c-Date-2025-10-31_Time-19-25-03.501813.log
code cell
In [92]:
draw(heatmapPeaks, heatmap_legend_side = "bot", annotation_legend_side = "bot")
code cell
In [110]:
#Volcano plots
pma <- plotMarkers(seMarker = markerPeaks, name = "ACs", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "MA")
pma
pv <- plotMarkers(seMarker = markerPeaks, name = "ACs", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "Volcano")
pv
Output:
Warning message:
“[1m[22mRemoved 28 rows containing missing values or values outside the scale range (`geom_point_rast()`).”
Warning message:
“[1m[22mRemoved 28 rows containing missing values or values outside the scale range (`geom_point_rast()`).”
code cell
In [118]:
#Marker Peaks in Browser Tracks
#markerGenes <- c(
# "CCND1", ##RPCs
# "ATOH7", ##NPCs
# "GAP43", ##RGCs
# "ONECUT1", ##HCs
# "TFAP2A", ##ACs
# "PRDM1", ##Cones
# "NRL" ##Rods
# )
p16 <- plotBrowserTrack(
ArchRProj = projHeme5,
groupBy = "Clusters2",
geneSymbol = c('CCND1'),
features = getMarkers(markerPeaks, cutOff = "FDR <= 0.1 & Log2FC >= 1", returnGR = TRUE)[['RPCs']],
upstream = 50000,
downstream = 50000
)
plotPDF(p16, name = "Plot-Tracks-With-Features", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a82ef79af8-Date-2025-10-31_Time-19-42-43.092084.log
If there is an issue, please report to github with logFile!
2025-10-31 19:42:43.416677 : Validating Region, 0.005 mins elapsed.
Output:
GRanges object with 1 range and 2 metadata columns:
seqnames ranges strand | gene_id symbol
<Rle> <IRanges> <Rle> | <character> <character>
[1] chr11 69641087-69654474 + | 595 CCND1
-------
seqinfo: 24 sequences from hg38 genome
Output:
2025-10-31 19:42:43.514388 : Adding Bulk Tracks (1 of 1), 0.007 mins elapsed.
2025-10-31 19:42:44.557288 : Adding Feature Tracks (1 of 1), 0.024 mins elapsed.
2025-10-31 19:42:44.622154 : Adding Gene Tracks (1 of 1), 0.026 mins elapsed.
2025-10-31 19:42:44.850726 : Plotting, 0.029 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a82ef79af8-Date-2025-10-31_Time-19-42-43.092084.log
Plotting Gtable!
Output:
NULL
_msg_send errno: 4 strerror: Interrupted system call
code cell
In [119]:
grid::grid.draw(p16$CCND1)
markdown cell
12. Motif and Feature Enrichment
code cell
In [120]:
projHeme5 <- addMotifAnnotations(ArchRProj = projHeme5, motifSet = "cisbp", name = "Motif")
Output:
ArchR logging to : ArchRLogs/ArchR-addMotifAnnotations-91a8121a56c-Date-2025-10-31_Time-19-43-03.340663.log
If there is an issue, please report to github with logFile!
2025-10-31 19:43:08.924174 : Getting Motif Set, Species : Homo sapiens, 0.005 mins elapsed.
Using version 2 motifs!
2025-10-31 19:43:10.724987 : Finding Motif Positions with motifmatchr!, 0.035 mins elapsed.
2025-10-31 19:53:01.423081 : All Motifs Overlap at least 1 peak!, 9.88 mins elapsed.
2025-10-31 19:53:01.431496 : Creating Motif Overlap Matrix, 9.88 mins elapsed.
2025-10-31 19:53:03.098856 : Finished Getting Motif Info!, 9.908 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addMotifAnnotations-91a8121a56c-Date-2025-10-31_Time-19-43-03.340663.log
code cell
In [121]:
#Find motif for specific gene
#Make chr:star-end for peaks
pSet <- getPeakSet(ArchRProj = projHeme5)
pSet$name <- paste(seqnames(pSet), start(pSet), end(pSet), sep = "_")
code cell
In [122]:
#Find matched motifs
matches <- getMatches(ArchRProj = projHeme5, name = "Motif")
rownames(matches) <- paste(seqnames(matches), start(matches), end(matches), sep = "_")
matches <- matches[pSet$name]
code cell
In [123]:
#Using CEBPA as example
gr <- GRanges(seqnames = c("chr19"), ranges = IRanges(start = c(33792929), end = c(33794030)))
code cell
In [124]:
queryHits <- queryHits(findOverlaps(query = pSet, subject = gr, type = "within"))
colnames(matches)[which(assay(matches[queryHits,]))]
code cell
In [126]:
#Motif Enrichment in Marker Peaks
enrichMotifs <- peakAnnoEnrichment(
seMarker = markerPeaks,
ArchRProj = projHeme5,
peakAnnotation = "Motif",
cutOff = "FDR <= 0.1 & Log2FC >= 0.5"
)
Output:
ArchR logging to : ArchRLogs/ArchR-peakAnnoEnrichment-91a825183bce-Date-2025-10-31_Time-19-58-56.148821.log
If there is an issue, please report to github with logFile!
2025-10-31 19:59:06.337613 : Computing Enrichments 1 of 8, 0.17 mins elapsed.
2025-10-31 19:59:06.456576 : Computing Enrichments 2 of 8, 0.172 mins elapsed.
2025-10-31 19:59:06.586954 : Computing Enrichments 3 of 8, 0.174 mins elapsed.
2025-10-31 19:59:06.712253 : Computing Enrichments 4 of 8, 0.176 mins elapsed.
2025-10-31 19:59:06.840202 : Computing Enrichments 5 of 8, 0.178 mins elapsed.
2025-10-31 19:59:06.970618 : Computing Enrichments 6 of 8, 0.18 mins elapsed.
2025-10-31 19:59:07.11032 : Computing Enrichments 7 of 8, 0.183 mins elapsed.
2025-10-31 19:59:07.233069 : Computing Enrichments 8 of 8, 0.185 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-peakAnnoEnrichment-91a825183bce-Date-2025-10-31_Time-19-58-56.148821.log
code cell
In [127]:
heatmapEM <- plotEnrichHeatmap(enrichMotifs, n = 7, transpose = TRUE)
Output:
ArchR logging to : ArchRLogs/ArchR-plotEnrichHeatmap-91a82c8112e1-Date-2025-10-31_Time-19-59-07.637417.log
If there is an issue, please report to github with logFile!
Adding Annotations..
Preparing Main Heatmap..
code cell
In [128]:
#Plotting motif logos
pwm <- getPeakAnnotation(projHeme5, "Motif")$motifs[["SOX6_868"]]
pwm
Result:
An object of class PWMatrix
ID: ENSG00000110693_LINE19574_SOX6_I_N7
Name: SOX6
Matrix Class: Unknown
strand: *
Pseudocounts:
Tags:
$ensembl
[1] "ENSG00000110693"
Background:
A C G T
0.25 0.25 0.25 0.25
Matrix:
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
A 0.2856363 1.236986 1.297203 -2.131381 1.340070 1.259222 -1.224160
C -0.6196693 -2.173286 -3.473518 1.257839 -3.473518 -1.259465 -1.366327
G -0.4525775 -1.628926 -1.447733 -1.817039 -2.592395 -1.817039 -3.473518
T 0.4023151 -1.407137 -2.592395 -1.604399 -2.592395 -3.473518 1.229625
[,8]
A 0.4500618
C -0.4283456
G 0.2154626
T -0.6169845
code cell
In [129]:
PWMatrixToProbMatrix <- function(x){
if (class(x) != "PWMatrix") stop("x must be a TFBSTools::PWMatrix object")
m <- (exp(as(x, "matrix"))) * TFBSTools::bg(x)/sum(TFBSTools::bg(x))
m <- t(t(m)/colSums(m))
m
}
ppm <- PWMatrixToProbMatrix(pwm)
ppm
A matrix: 4 × 8 of type dbl
| A | 0.3326521 | 0.86130341 | 0.914762277 | 0.02966834 | 0.954827787 | 0.880670100 | 0.073501136 | 0.3921023 |
| C | 0.1345306 | 0.02845076 | 0.007751938 | 0.87945252 | 0.007751938 | 0.070951425 | 0.063760514 | 0.1628965 |
| G | 0.1589967 | 0.04903503 | 0.058775647 | 0.04062654 | 0.018710138 | 0.040626537 | 0.007751938 | 0.3101089 |
| T | 0.3738206 | 0.06121080 | 0.018710138 | 0.05025260 | 0.018710138 | 0.007751938 | 0.854986412 | 0.1348923 |
code cell
In [130]:
library(ggseqlogo)
colSums(ppm) %>% range
ggseqlogo(ppm, method = "bits")
Output:
Warning message:
“package ‘ggseqlogo’ was built under R version 4.3.3”
code cell
In [131]:
#For more customizable enrichments, see 14.2 in ArchR website
markdown cell
13. ChromVAR Deviatons Enrichment
code cell
In [142]:
if("Motif" %ni% names(projHeme5@peakAnnotation)){
projHeme5 <- addMotifAnnotations(ArchRProj = projHeme5, motifSet = "cisbp", name = "Motif")
}
code cell
In [143]:
projHeme5 <- addBgdPeaks(projHeme5)
Output:
Identifying Background Peaks!
code cell
In [144]:
projHeme5 <- addDeviationsMatrix(
ArchRProj = projHeme5,
peakAnnotation = "Motif",
force = TRUE
)
Output:
Using Previous Background Peaks!
ArchR logging to : ArchRLogs/ArchR-addDeviationsMatrix-91a83547fae9-Date-2025-10-31_Time-20-06-56.868617.log
If there is an issue, please report to github with logFile!
2025-10-31 20:07:00.761119 : Batch Execution w/ safelapply!, 0 mins elapsed.
###########
2025-10-31 20:15:20.370548 : Completed Computing Deviations!, 8.392 mins elapsed.
###########
ArchR logging successful to : ArchRLogs/ArchR-addDeviationsMatrix-91a83547fae9-Date-2025-10-31_Time-20-06-56.868617.log
code cell
In [145]:
plotVarDev <- getVarDeviations(projHeme5, name = "MotifMatrix", plot = TRUE)
Output:
DataFrame with 6 rows and 6 columns
seqnames idx name combinedVars combinedMeans rank
<Rle> <integer> <character> <numeric> <numeric> <integer>
f470 z 470 GSC_470 22.4508 -0.442743 1
f402 z 402 GSC2_402 21.3914 -0.438434 2
f560 z 560 DMBX1_560 21.2762 -0.428337 3
f426 z 426 PITX3_426 20.4690 -0.416372 4
f565 z 565 DPRX_565 19.8257 -0.419043 5
f404 z 404 PITX1_404 16.6746 -0.390313 6
code cell
In [146]:
plotVarDev
Output:
Warning message:
“ggrepel: 7 unlabeled data points (too many overlaps). Consider increasing max.overlaps”
code cell
In [147]:
motifs <- c(
"CCND1", ##RPCs
"ATOH7", ##NPCs
"GAP43", ##RGCs
"ONECUT1", ##HCs
"TFAP2A", ##ACs
"PRDM1", ##Cones
"NRL" ##Rods
)
markerMotifs <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "MotifMatrix")
markerMotifs
- 'z:ONECUT1_295'
- 'z:PRDM16_211'
- 'z:PRDM1_163'
- 'z:NRL_123'
- 'z:ATOH7_84'
- 'z:TFAP2A_5'
- 'deviations:ONECUT1_295'
- 'deviations:PRDM16_211'
- 'deviations:PRDM1_163'
- 'deviations:NRL_123'
- 'deviations:ATOH7_84'
- 'deviations:TFAP2A_5'
code cell
In [148]:
#Remove extra motifs
markerMotifs <- grep("z:", markerMotifs, value = TRUE)
markerMotifs <- markerMotifs[markerMotifs %ni% c("z:PRDM16_211",'z:PRDM16_211')]
markerMotifs
- 'z:ONECUT1_295'
- 'z:PRDM1_163'
- 'z:NRL_123'
- 'z:ATOH7_84'
- 'z:TFAP2A_5'
code cell
In [149]:
pD <- plotGroups(ArchRProj = projHeme5,
groupBy = "Clusters2",
colorBy = "MotifMatrix",
name = markerMotifs,
imputeWeights = getImputeWeights(projHeme5)
)
Output:
Getting ImputeWeights
No imputeWeights found, returning NULL
Getting Matrix Values...
2025-10-31 20:19:20.371884 :
1
2
3
4
5
code cell
In [152]:
pD2 <- lapply(seq_along(pD), function(x){
if(x != 1){
pD[[x]] + guides(color = "none", fill = "none") +
theme_ArchR(baseSize = 6) +
theme(plot.margin = unit(c(0.1, 0.1, 0.1, 0.1), "cm")) +
theme(
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
axis.title.y=element_blank()
) + ylab("")
}else{
pD[[x]] + guides(color = "none", fill = "none") +
theme_ArchR(baseSize = 6) +
theme(plot.margin = unit(c(0.1, 0.1, 0.1, 0.1), "cm")) +
theme(
axis.ticks.y=element_blank(),
axis.title.y=element_blank()
) + ylab("")
}
})
do.call(cowplot::plot_grid, c(list(nrow = 1, rel_widths = c(2, rep(1, length(pD2) - 1))),pD2))
Output:
Picking joint bandwidth of 0.266
Picking joint bandwidth of 0.32
Picking joint bandwidth of 0.274
Picking joint bandwidth of 0.362
Picking joint bandwidth of 0.266
code cell
In [153]:
#Plot in UMAP
pE <- plotEmbedding(
ArchRProj = projHeme5,
colorBy = "MotifMatrix",
name = sort(markerMotifs),
embedding = "UMAP",
imputeWeights = getImputeWeights(projHeme5)
)
Output:
Getting ImputeWeights
No imputeWeights found, returning NULL
ArchR logging to : ArchRLogs/ArchR-plotEmbedding-91a843902a36-Date-2025-10-31_Time-20-20-00.828931.log
If there is an issue, please report to github with logFile!
Getting UMAP Embedding
ColorBy = MotifMatrix
Getting Matrix Values...
2025-10-31 20:20:01.524164 :
Plotting Embedding
1
2
3
4
5
ArchR logging successful to : ArchRLogs/ArchR-plotEmbedding-91a843902a36-Date-2025-10-31_Time-20-20-00.828931.log
code cell
In [154]:
pE2 <- lapply(pE, function(x){
x + guides(color = "none", fill = "none") +
theme_ArchR(baseSize = 6.5) +
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()
)
})
do.call(cowplot::plot_grid, c(list(ncol = 3),pE2))
code cell
#Plot by gene expression scores
markerGS <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "GeneScoreMatrix")
markerGS <- markerGS[markerGS %in% motifs]
markerGS
- 'ATOH7'
- 'CCND1'
- 'NRL'
- 'ONECUT1'
- 'GAP43'
- 'TFAP2A'
- 'PRDM1'
code cell
pG <- plotEmbedding(
ArchRProj = projHeme5,
colorBy = "GeneScoreMatrix",
name = sort(markerGS),
embedding = "UMAP",
imputeWeights = getImputeWeights(projHeme5)
)
code cell
pG2 <- lapply(pG, function(x){
x + guides(color = "none", fill = "none") +
theme_ArchR(baseSize = 6.5) +
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()
)
})
do.call(cowplot::plot_grid, c(list(ncol = 3),pG2))
code cell
#Plot by linked gene expression
markerRNA <- getFeatures(projHeme5, select = paste(motifs, collapse="|"), useMatrix = "GeneIntegrationMatrix")
markerRNA <- markerRNA[markerRNA %in% motifs]
markerRNA
code cell
pl <- plotEmbedding(
ArchRProj = projHeme5,
colorBy = "GeneIntegrationMatrix",
name = sort(markerRNA),
embedding = "UMAP",
continuousSet = "blueYellow",
imputeWeights = getImputeWeights(projHeme5)
)
code cell
pl2 <- lapply(pl, function(x){
x + guides(color = "none", fill = "none") +
theme_ArchR(baseSize = 6.5) +
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) +
theme(
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank()
)
})
do.call(cowplot::plot_grid, c(list(ncol = 3),pl2))
markdown cell
14. Integrative Analysis
code cell
In [132]:
projHeme5 <- addCoAccessibility(
ArchRProj = projHeme5,
reducedDims = "IterativeLSI"
)
Output:
ArchR logging to : ArchRLogs/ArchR-addCoAccessibility-91a8736e9148-Date-2025-10-31_Time-19-59-18.421382.log
If there is an issue, please report to github with logFile!
2025-10-31 19:59:19.637594 : Computing KNN, 0.02 mins elapsed.
2025-10-31 19:59:19.764566 : Identifying Non-Overlapping KNN pairs, 0.022 mins elapsed.
2025-10-31 19:59:22.033373 : Identified 475 Groupings!, 0.06 mins elapsed.
2025-10-31 19:59:27.011613 : Computing Co-Accessibility chr1 (1 of 23), 0.143 mins elapsed.
2025-10-31 19:59:33.336164 : Computing Co-Accessibility chr2 (2 of 23), 0.249 mins elapsed.
2025-10-31 19:59:38.255038 : Computing Co-Accessibility chr3 (3 of 23), 0.331 mins elapsed.
2025-10-31 19:59:42.666384 : Computing Co-Accessibility chr4 (4 of 23), 0.404 mins elapsed.
2025-10-31 19:59:46.506188 : Computing Co-Accessibility chr5 (5 of 23), 0.468 mins elapsed.
2025-10-31 19:59:50.578756 : Computing Co-Accessibility chr6 (6 of 23), 0.536 mins elapsed.
2025-10-31 19:59:54.922091 : Computing Co-Accessibility chr7 (7 of 23), 0.608 mins elapsed.
2025-10-31 19:59:59.131901 : Computing Co-Accessibility chr8 (8 of 23), 0.679 mins elapsed.
2025-10-31 20:00:03.037311 : Computing Co-Accessibility chr9 (9 of 23), 0.744 mins elapsed.
2025-10-31 20:00:07.09436 : Computing Co-Accessibility chr10 (10 of 23), 0.811 mins elapsed.
2025-10-31 20:00:11.137756 : Computing Co-Accessibility chr11 (11 of 23), 0.879 mins elapsed.
2025-10-31 20:00:15.627425 : Computing Co-Accessibility chr12 (12 of 23), 0.953 mins elapsed.
2025-10-31 20:00:19.890201 : Computing Co-Accessibility chr13 (13 of 23), 1.024 mins elapsed.
2025-10-31 20:00:23.276465 : Computing Co-Accessibility chr14 (14 of 23), 1.081 mins elapsed.
2025-10-31 20:00:27.039278 : Computing Co-Accessibility chr15 (15 of 23), 1.144 mins elapsed.
2025-10-31 20:00:30.757071 : Computing Co-Accessibility chr16 (16 of 23), 1.206 mins elapsed.
2025-10-31 20:00:34.730788 : Computing Co-Accessibility chr17 (17 of 23), 1.272 mins elapsed.
2025-10-31 20:00:39.286797 : Computing Co-Accessibility chr18 (18 of 23), 1.348 mins elapsed.
2025-10-31 20:00:42.558676 : Computing Co-Accessibility chr19 (19 of 23), 1.402 mins elapsed.
2025-10-31 20:00:47.111523 : Computing Co-Accessibility chr20 (20 of 23), 1.478 mins elapsed.
2025-10-31 20:00:50.670185 : Computing Co-Accessibility chr21 (21 of 23), 1.537 mins elapsed.
2025-10-31 20:00:53.734493 : Computing Co-Accessibility chr22 (22 of 23), 1.589 mins elapsed.
2025-10-31 20:00:57.203727 : Computing Co-Accessibility chrX (23 of 23), 1.646 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-addCoAccessibility-91a8736e9148-Date-2025-10-31_Time-19-59-18.421382.log
code cell
In [133]:
cA <- getCoAccessibility(
ArchRProj = projHeme5,
corCutOff = 0.5,
resolution = 1,
returnLoops = FALSE
)
code cell
In [134]:
cA
Result:
DataFrame with 166850 rows and 11 columns
queryHits subjectHits seqnames correlation Variability1 Variability2
<integer> <integer> <Rle> <numeric> <numeric> <numeric>
1 5 22 chr1 0.605403 0.00646632 0.08812707
2 5 24 chr1 0.591242 0.00646632 0.01572732
3 8 30 chr1 0.557246 0.00705825 0.00453338
4 8 42 chr1 0.559279 0.00705825 0.00917213
5 8 53 chr1 0.606811 0.00705825 0.03785115
... ... ... ... ... ... ...
166846 129068 129062 chrX 0.548497 0.000829654 0.000494894
166847 129068 129071 chrX 0.625279 0.000829654 0.000993719
166848 129071 129058 chrX 0.632109 0.000993719 0.001208354
166849 129071 129062 chrX 0.623128 0.000993719 0.000494894
166850 129071 129068 chrX 0.625279 0.000993719 0.000829654
TStat Pval FDR VarQuantile1 VarQuantile2
<numeric> <numeric> <numeric> <numeric> <numeric>
1 16.5427 7.79188e-49 2.11657e-47 0.786511 0.999423
2 15.9439 4.10395e-46 9.98569e-45 0.786511 0.939097
3 14.5954 4.22879e-40 7.99728e-39 0.805353 0.708505
4 14.6729 1.93124e-40 3.70914e-39 0.805353 0.857611
5 16.6036 4.10729e-49 1.12745e-47 0.805353 0.990253
... ... ... ... ... ...
166846 14.2666 1.15875e-38 2.05649e-37 0.239415 0.115521
166847 17.4256 6.81007e-53 2.16212e-51 0.239415 0.289439
166848 17.7414 2.34308e-54 7.84521e-53 0.289439 0.344420
166849 17.3275 1.93406e-52 6.03719e-51 0.289439 0.115521
166850 17.4256 6.81007e-53 2.16212e-51 0.289439 0.239415
code cell
In [136]:
cA <- getCoAccessibility(
ArchRProj = projHeme5,
corCutOff = 0.5,
resolution = 1,
returnLoops = TRUE
)
code cell
In [138]:
cA[[1]]
Result:
GRanges object with 83425 ranges and 9 metadata columns:
seqnames ranges strand | correlation Variability1
<Rle> <IRanges> <Rle> | <numeric> <numeric>
[1] chr1 869931-933646 * | 0.605403 0.00646632
[2] chr1 869931-936022 * | 0.591242 0.00646632
[3] chr1 905435-940961 * | 0.557246 0.00705825
[4] chr1 905435-967007 * | 0.559279 0.00705825
[5] chr1 905435-1000160 * | 0.606811 0.00705825
... ... ... ... . ... ...
[83421] chrX 154730383-154735921 * | 0.556769 0.005865527
[83422] chrX 154730383-154738200 * | 0.629392 0.006515845
[83423] chrX 154730902-154762145 * | 0.548497 0.000494894
[83424] chrX 154730902-154799790 * | 0.623128 0.000494894
[83425] chrX 154762145-154799790 * | 0.625279 0.000829654
Variability2 TStat Pval FDR VarQuantile1
<numeric> <numeric> <numeric> <numeric> <numeric>
[1] 0.08812707 16.5427 7.79188e-49 2.11657e-47 0.786511
[2] 0.01572732 15.9439 4.10395e-46 9.98569e-45 0.786511
[3] 0.00453338 14.5954 4.22879e-40 7.99728e-39 0.805353
[4] 0.00917213 14.6729 1.93124e-40 3.70914e-39 0.805353
[5] 0.03785115 16.6036 4.10729e-49 1.12745e-47 0.805353
... ... ... ... ... ...
[83421] 0.001167913 14.5773 5.07777e-40 9.57071e-39 0.766055
[83422] 0.005865527 17.6150 9.04491e-54 2.96547e-52 0.788074
[83423] 0.000829654 14.2666 1.15875e-38 2.05649e-37 0.115521
[83424] 0.000993719 17.3275 1.93406e-52 6.03719e-51 0.115521
[83425] 0.000993719 17.4256 6.81007e-53 2.16212e-51 0.239415
VarQuantile2 value
<numeric> <numeric>
[1] 0.999423 0.605403
[2] 0.939097 0.591242
[3] 0.708505 0.557246
[4] 0.857611 0.559279
[5] 0.990253 0.606811
... ... ...
[83421] 0.334468 0.556769
[83422] 0.766055 0.629392
[83423] 0.239415 0.548497
[83424] 0.289439 0.623128
[83425] 0.289439 0.625279
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [139]:
cALoops <- cA[[1]]
cALoops <- cALoops[cALoops$FDR < 10^-10]
cALoops <- cALoops[rowMins(cbind(cALoops$VarQuantile1,cALoops$VarQuantile2)) > 0.35]
cALoops
Result:
GRanges object with 51208 ranges and 9 metadata columns:
seqnames ranges strand | correlation Variability1
<Rle> <IRanges> <Rle> | <numeric> <numeric>
[1] chr1 869931-933646 * | 0.605403 0.00646632
[2] chr1 869931-936022 * | 0.591242 0.00646632
[3] chr1 905435-940961 * | 0.557246 0.00705825
[4] chr1 905435-967007 * | 0.559279 0.00705825
[5] chr1 905435-1000160 * | 0.606811 0.00705825
... ... ... ... . ... ...
[51204] chrX 154460076-154490796 * | 0.521831 0.00184088
[51205] chrX 154460076-154541392 * | 0.597480 0.00534552
[51206] chrX 154490796-154541392 * | 0.594713 0.01306984
[51207] chrX 154728098-154738200 * | 0.529508 0.00214815
[51208] chrX 154730383-154738200 * | 0.629392 0.00651584
Variability2 TStat Pval FDR VarQuantile1
<numeric> <numeric> <numeric> <numeric> <numeric>
[1] 0.08812707 16.5427 7.79188e-49 2.11657e-47 0.786511
[2] 0.01572732 15.9439 4.10395e-46 9.98569e-45 0.786511
[3] 0.00453338 14.5954 4.22879e-40 7.99728e-39 0.805353
[4] 0.00917213 14.6729 1.93124e-40 3.70914e-39 0.805353
[5] 0.03785115 16.6036 4.10729e-49 1.12745e-47 0.805353
... ... ... ... ... ...
[51204] 0.01306984 13.3041 1.56668e-34 2.31273e-33 0.464842
[51205] 0.00184088 16.2048 2.69772e-47 6.88544e-46 0.746096
[51206] 0.00534552 16.0885 9.09022e-47 2.26934e-45 0.916135
[51207] 0.00651584 13.5754 1.10446e-35 1.71740e-34 0.509330
[51208] 0.00586553 17.6150 9.04491e-54 2.96547e-52 0.788074
VarQuantile2 value
<numeric> <numeric>
[1] 0.999423 0.605403
[2] 0.939097 0.591242
[3] 0.708505 0.557246
[4] 0.857611 0.559279
[5] 0.990253 0.606811
... ... ...
[51204] 0.916135 0.521831
[51205] 0.464842 0.597480
[51206] 0.746096 0.594713
[51207] 0.788074 0.529508
[51208] 0.766055 0.629392
-------
seqinfo: 23 sequences from an unspecified genome; no seqlengths
code cell
In [140]:
p17 <- plotBrowserTrack(
ArchRProj = projHeme5,
groupBy = "Clusters2",
geneSymbol = markerGenes,
upstream = 50000,
downstream = 50000,
loops = getCoAccessibility(projHeme5)
)
Output:
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-91a826687ad8-Date-2025-10-31_Time-20-05-42.333402.log
If there is an issue, please report to github with logFile!
2025-10-31 20:05:42.57646 : Validating Region, 0.004 mins elapsed.
Output:
GRanges object with 7 ranges and 2 metadata columns:
seqnames ranges strand | gene_id symbol
<Rle> <IRanges> <Rle> | <character> <character>
[1] chr11 69641087-69654474 + | 595 CCND1
[2] chr10 68230624-68232103 - | 220202 ATOH7
[3] chr3 115623324-115721490 + | 2596 GAP43
[4] chr15 52756989-52791078 - | 3175 ONECUT1
[5] chr6 10393186-10419659 - | 7020 TFAP2A
[6] chr6 106086320-106109939 + | 639 PRDM1
[7] chr14 24080107-24115014 - | 4901 NRL
-------
seqinfo: 24 sequences from hg38 genome
Output:
2025-10-31 20:05:42.64059 : Adding Bulk Tracks (1 of 7), 0.005 mins elapsed.
2025-10-31 20:05:43.824628 : Adding Feature Tracks (1 of 7), 0.025 mins elapsed.
2025-10-31 20:05:43.888833 : Adding Loop Tracks (1 of 7), 0.026 mins elapsed.
2025-10-31 20:05:44.191854 : Adding Gene Tracks (1 of 7), 0.031 mins elapsed.
2025-10-31 20:05:44.418705 : Plotting, 0.035 mins elapsed.
2025-10-31 20:05:45.343415 : Adding Bulk Tracks (2 of 7), 0.05 mins elapsed.
2025-10-31 20:05:46.402727 : Adding Feature Tracks (2 of 7), 0.068 mins elapsed.
2025-10-31 20:05:46.466118 : Adding Loop Tracks (2 of 7), 0.069 mins elapsed.
2025-10-31 20:05:46.623292 : Adding Gene Tracks (2 of 7), 0.072 mins elapsed.
2025-10-31 20:05:46.85263 : Plotting, 0.075 mins elapsed.
2025-10-31 20:05:47.710006 : Adding Bulk Tracks (3 of 7), 0.09 mins elapsed.
2025-10-31 20:05:48.892102 : Adding Feature Tracks (3 of 7), 0.109 mins elapsed.
2025-10-31 20:05:48.949689 : Adding Loop Tracks (3 of 7), 0.11 mins elapsed.
2025-10-31 20:05:48.986819 : Adding Gene Tracks (3 of 7), 0.111 mins elapsed.
2025-10-31 20:05:49.211452 : Plotting, 0.115 mins elapsed.
2025-10-31 20:05:50.044354 : Adding Bulk Tracks (4 of 7), 0.129 mins elapsed.
2025-10-31 20:05:50.997637 : Adding Feature Tracks (4 of 7), 0.144 mins elapsed.
2025-10-31 20:05:51.059533 : Adding Loop Tracks (4 of 7), 0.145 mins elapsed.
2025-10-31 20:05:51.143282 : Adding Gene Tracks (4 of 7), 0.147 mins elapsed.
2025-10-31 20:05:51.368527 : Plotting, 0.151 mins elapsed.
2025-10-31 20:05:52.161704 : Adding Bulk Tracks (5 of 7), 0.164 mins elapsed.
2025-10-31 20:05:53.374133 : Adding Feature Tracks (5 of 7), 0.184 mins elapsed.
2025-10-31 20:05:53.439228 : Adding Loop Tracks (5 of 7), 0.185 mins elapsed.
2025-10-31 20:05:53.7358 : Adding Gene Tracks (5 of 7), 0.19 mins elapsed.
2025-10-31 20:05:53.986343 : Plotting, 0.194 mins elapsed.
2025-10-31 20:05:54.981452 : Adding Bulk Tracks (6 of 7), 0.211 mins elapsed.
2025-10-31 20:05:56.207274 : Adding Feature Tracks (6 of 7), 0.231 mins elapsed.
2025-10-31 20:05:56.266981 : Adding Loop Tracks (6 of 7), 0.232 mins elapsed.
2025-10-31 20:05:57.065695 : Adding Gene Tracks (6 of 7), 0.246 mins elapsed.
2025-10-31 20:05:57.28871 : Plotting, 0.249 mins elapsed.
2025-10-31 20:05:58.116256 : Adding Bulk Tracks (7 of 7), 0.263 mins elapsed.
2025-10-31 20:05:59.111555 : Adding Feature Tracks (7 of 7), 0.28 mins elapsed.
2025-10-31 20:05:59.175051 : Adding Loop Tracks (7 of 7), 0.281 mins elapsed.
2025-10-31 20:06:00.651995 : Adding Gene Tracks (7 of 7), 0.305 mins elapsed.
2025-10-31 20:06:00.875915 : Plotting, 0.309 mins elapsed.
ArchR logging successful to : ArchRLogs/ArchR-plotBrowserTrack-91a826687ad8-Date-2025-10-31_Time-20-05-42.333402.log
code cell
In [141]:
grid::grid.draw(p17$CCND1)
code cell
projHeme5 <- saveArchRProject(ArchRProj = projHeme5, outputDirectory = "Save-ProjHeme5", load = TRUE)